Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and
stats to different workqueues") splits the workqueue per event type. Add
a mutex to serialize updates.
Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: mtk_eth_soc: fix undefined reference to `dsa_port_from_netdev'
Caused by:
CONFIG_NET_DSA=m
CONFIG_NET_MEDIATEK_SOC=y
mtk_ppe_offload.c:undefined reference to `dsa_port_from_netdev'
Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add vlan match and pop actions to the flowtable offload,
patches from wenxu.
2) Reduce size of the netns_ct structure, which itself is
embedded in struct net. Make netns_ct a read-mostly structure.
Patches from Florian Westphal.
3) Add FLOW_OFFLOAD_XMIT_UNSPEC to skip dst check from garbage
collector path, as required by the tc CT action. From Roi Dayan.
4) VLAN offload fixes for nftables: Allow for matching on both s-vlan
and c-vlan selectors. Fix match of VLAN id due to incorrect
byteorder. Add a new routine to properly populate flow dissector
ethertypes.
5) Missing keys in ip{6}_route_me_harder() results in incorrect
routes. This includes an update for selftest infra. Patches
from Ido Schimmel.
6) Add counter hardware offload support through FLOW_CLS_STATS.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
DENG Qingfang [Sat, 17 Apr 2021 07:29:04 +0000 (15:29 +0800)]
net: ethernet: mediatek: fix a typo bug in flow offloading
Issue was traffic problems after a while with increased ping times if
flow offload is active. It turns out that key_offset with cookie is
needed in rhashtable_params but was re-assigned to head_offset.
Fix the assignment.
Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
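The mix-up in the flow offloading fix above is easier to see in a standalone sketch. The struct and field names below are hypothetical stand-ins that only mirror the key_offset/head_offset/key_len idea of rhashtable_params; this is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the offsets a hash table needs. */
struct ht_params {
	size_t key_len;     /* size of the lookup key */
	size_t key_offset;  /* where the key lives inside an entry */
	size_t head_offset; /* where the list linkage lives inside an entry */
};

/* Hypothetical entry: linkage first, then the cookie used as the key. */
struct flow_entry {
	struct { void *next; } node;
	unsigned long cookie;
};

/* The two offsets must differ, otherwise the sketch shows nothing. */
static_assert(offsetof(struct flow_entry, cookie) !=
	      offsetof(struct flow_entry, node), "key and linkage overlap");

/* Buggy shape: the cookie offset lands in head_offset, so key_offset
 * silently stays 0 and the table would hash the linkage, not the cookie. */
static struct ht_params buggy_params(void)
{
	struct ht_params p = { 0 };

	p.key_len = sizeof(unsigned long);
	p.head_offset = offsetof(struct flow_entry, node);
	p.head_offset = offsetof(struct flow_entry, cookie); /* the typo */
	return p;
}

/* Fixed shape: each offset is assigned to its own field. */
static struct ht_params fixed_params(void)
{
	struct ht_params p = { 0 };

	p.key_len = sizeof(unsigned long);
	p.head_offset = offsetof(struct flow_entry, node);
	p.key_offset = offsetof(struct flow_entry, cookie);
	return p;
}
```

Comparing buggy_params() with fixed_params() shows that the only difference is which field receives the cookie offset.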
Huazhong Tan [Sat, 17 Apr 2021 07:09:24 +0000 (15:09 +0800)]
net: hns3: change the value of the SEPARATOR_VALUE macro in hclgevf_main.c
The SEPARATOR_VALUE macro is used as a separator when getting
the register value, but the value of this macro differs
between pf and vf, which is a bit confusing for the user. So
synchronize the value of vf with pf.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Sat, 17 Apr 2021 07:09:22 +0000 (15:09 +0800)]
net: hns3: remove a duplicate pf reset counting
When entering suspend mode, the counter of pf reset will be increased
twice, since both hclge_prepare_general() and hclge_prepare_wait()
increase this counter. So remove the duplicate counting in
hclge_prepare_general().
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Sat, 17 Apr 2021 06:55:54 +0000 (23:55 -0700)]
net: xilinx: drivers need/depend on HAS_IOMEM
kernel test robot reports build errors in 3 Xilinx ethernet drivers.
They all use ioremap functions that are only available when HAS_IOMEM
is set/enabled. If it is not enabled, they all have build errors,
so make these 3 drivers depend on HAS_IOMEM.
ld: drivers/net/ethernet/xilinx/xilinx_emaclite.o: in function `xemaclite_of_probe':
xilinx_emaclite.c:(.text+0x9fc): undefined reference to `devm_ioremap_resource'
ld: drivers/net/ethernet/xilinx/xilinx_axienet_main.o: in function `axienet_probe':
xilinx_axienet_main.c:(.text+0x942): undefined reference to `devm_ioremap_resource'
ld: drivers/net/ethernet/xilinx/ll_temac_main.o: in function `temac_probe':
ll_temac_main.c:(.text+0x1283): undefined reference to `devm_platform_ioremap_resource_byname'
ld: ll_temac_main.c:(.text+0x13ad): undefined reference to `devm_of_iomap'
ld: ll_temac_main.c:(.text+0x162e): undefined reference to `devm_platform_ioremap_resource'
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Zhang Changzhong <zhangchangzhong@huawei.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: stable@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Apr 2021 22:31:45 +0000 (15:31 -0700)]
Merge branch 'enetc-flow-control'
Vladimir Oltean says:
====================
Flow control for NXP ENETC
This patch series contains logic for enabling the lossless mode on the
RX rings of the ENETC, and the PAUSE thresholds on the internal FIFO
memory.
During testing it was found that, with the default FIFO configuration,
a sender which isn't persuaded by our PAUSE frames and keeps sending
will cause some MAC RX frame errors. To mitigate this, we need to ensure
that the FIFO never runs completely full, so we need to fix up a setting
that was supposed to be configured well out of reset. Unfortunately this
requires the addition of a new mini-driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 16 Apr 2021 23:42:25 +0000 (02:42 +0300)]
net: enetc: add support for flow control
In the ENETC receive path, a frame received by the MAC is first stored
in a 256KB 'FIFO' memory, then transferred to DRAM when enqueuing it to
the RX ring. The FIFO is a shared resource for all ENETC ports, but
every port keeps track of its own memory utilization, on RX and on TX.
There is a setting for RX rings through which they can either operate in
'lossy' mode (where the lack of a free buffer causes an immediate
discard of the frame) or in 'lossless' mode (where the lack of a free
buffer in the ring makes the frame stay longer in the FIFO).
In turn, when the memory utilization of the FIFO exceeds a certain
margin, the MAC can be configured to emit PAUSE frames.
There is enough FIFO memory to buffer up to 3 MTU-sized frames per RX
port while not jeopardizing the other use cases (jumbo frames), and
also not consume bytes from the port TX allocations. Also, 3 MTU-sized
frames worth of memory is enough to ensure zero loss for 64 byte packets
at 1G line rate.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 16 Apr 2021 23:42:23 +0000 (02:42 +0300)]
net: enetc: add a mini driver for the Integrated Endpoint Register Block
The NXP ENETC is a 4-port Ethernet controller which 'smells' to
operating systems like 4 distinct PCIe PFs with SR-IOV, each PF having
its own driver instance, but in fact there are some hardware resources
which are shared between all ports, like for example the 256 KB SRAM
FIFO between the MACs and the Host Transfer Agent which DMAs frames to
DRAM.
To hide the stuff that cannot be neatly exposed per port, the hardware
designers came up with this idea of having a dedicated register block
which is supposed to be populated by the bootloader, and contains
everything configuration-related: MAC addresses, FIFO partitioning, etc.
When a port is reset using PCIe Function Level Reset, its defaults are
transferred from the IERB configuration. Most of the time, the settings
made through the IERB are read-only in the port's memory space (if they
are even visible), so they cannot be modified at runtime.
Linux doesn't have any advanced FIFO partitioning requirements at all,
but when reading through the hardware manual, it became clear that, even
though there are many good 'recommendations' for default values, many of
them were not actually put in practice on LS1028A. So we end up with a
default configuration that:
(a) does not have enough TX and RX byte credits to support the max MTU
of 9600 (which the Linux driver claims already) properly (at full speed)
(b) allows the FIFO to be overrun with RX traffic, potentially
overwriting internal data structures.
The last part sounds a bit catastrophic, but it isn't. Frames are
supposed to transit the FIFO for a very short time, but they can
actually accumulate there under 2 conditions:
(a) there is very severe congestion on DRAM memory, or
(b) the RX rings visible to the operating system were configured for
lossless operation, and they just ran out of free buffers to copy
the frame to. This is what is used to put backpressure onto the MAC
with flow control.
So since ENETC has not supported flow control thus far, RX FIFO overruns
were never seen with Linux. But with the addition of flow control, we
should configure some registers to prevent this from happening. What we
are trying to protect against are bad actors which continue to send us
traffic despite the fact that we have signaled a PAUSE condition. Of
course we can't be lossless in that case, but it is best to configure
the FIFO to do tail dropping rather than letting it overrun.
So in a nutshell, this driver is a fixup for all the IERB default values
that should have been but aren't.
The IERB configuration needs to be done _before_ the PFs are enabled.
So every PF searches for the presence of the "fsl,ls1028a-enetc-ierb"
node in the device tree, and if it finds it, it "registers" with the
IERB, which means that it requests the IERB to fix up its default
values. This is done through -EPROBE_DEFER. The IERB driver is part of
the fsl_enetc module, but is technically a platform driver, since the
IERB is a good old fashioned MMIO region, as opposed to ENETC ports
which pretend to be PCIe devices.
The driver was already configuring ENETC_PTXMBAR (FIFO allocation for
TX) because due to an omission, TXMBAR is a read/write register in the
PF memory space. But the manual is quite clear that the formula for this
should depend upon the TX byte credits (TXBCR). In turn, the TX byte
credits are only readable/writable through the IERB. So if we want to
ensure that the TXBCR register also has a value that is correct and in
line with TXMBAR, there is simply no way this can be done from the PF
driver; access to the IERB is needed.
I could have modified U-Boot to fix up the IERB values, but that is
quite undesirable, as old U-Boot versions are likely to be floating
around for quite some time from now.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 16 Apr 2021 23:42:21 +0000 (02:42 +0300)]
net: enetc: create a common enetc_pf_to_port helper
Even though ENETC interfaces are exposed as individual PCIe PFs with
their own driver instances, the ENETC is still fundamentally a
multi-port Ethernet controller, and some parts of the IP take a port
number (as can be seen in the PSFP implementation).
Create a common helper that can be used outside of the TSN code for
retrieving the ENETC port number based on the PF number. This is only
correct for LS1028A, the only Linux-capable instantiation of ENETC thus
far.
Note that ENETC port 3 is PF 6. The TSN code did not care about this
because ENETC port 3 does not support TSN, so the wrong mapping done by
enetc_get_port for PF 6 could have never been hit.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
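A minimal sketch of the PF-to-port mapping described in the enetc_pf_to_port commit above. Only "port 3 is PF 6" comes from the text; the identity mapping for PFs 0-2, the plain int argument and the -1 error value are assumptions made for illustration.

```c
#include <assert.h>

/* Sketch: map a PF number to an ENETC port number on LS1028A.
 * Assumed: PFs 0-2 are ports 0-2; stated in the text: PF 6 is port 3. */
static int pf_to_port(int pf)
{
	assert(pf >= 0);

	switch (pf) {
	case 0:
	case 1:
	case 2:
		return pf;
	case 6:
		return 3;
	default:
		return -1; /* this PF is not an ENETC port */
	}
}
```

With this shape, a caller that needs a port number can treat a negative return as "not applicable" instead of silently using a wrong index.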
ethtool: ioctl: Fix out-of-bounds warning in store_link_ksettings_for_user()
Fix the following out-of-bounds warning:
net/ethtool/ioctl.c:492:2: warning: 'memcpy' offset [49, 84] from the object at 'link_usettings' is out of the bounds of referenced subobject 'base' with type 'struct ethtool_link_settings' at offset 0 [-Warray-bounds]
The problem is that the original code is trying to copy data into
some struct members adjacent to each other in a single call to
memcpy(). This causes a legitimate compiler warning because memcpy()
overruns the length of &link_usettings.base. Fix this by directly
using &link_usettings and _from_ as destination and source addresses,
instead.
This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/109
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
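The idea behind the -Warray-bounds fix above, as a rough userspace sketch. The struct layout and names are hypothetical; they only imitate a "base" member followed by adjacent members.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout: a base struct followed by adjacent members. */
struct base_settings {
	unsigned int cmd;
	unsigned int speed;
};

struct user_settings {
	struct base_settings base;
	unsigned int link_modes[3];
};

/* Copy through pointers to the whole object, so the length passed to
 * memcpy() matches the type being addressed. Passing &dst->base with
 * sizeof(*dst) would move the same bytes, but it addresses only the
 * 'base' subobject, which is what the compiler warns about. */
static void copy_settings(struct user_settings *dst,
			  const struct user_settings *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, sizeof(*dst));
}
```

The copied bytes are identical in both variants; the change only makes the destination type agree with the length.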
David S. Miller [Mon, 19 Apr 2021 22:20:35 +0000 (15:20 -0700)]
Merge branch 'nh-flushing'
Ido Schimmel says:
====================
nexthop: Support large scale nexthop flushing
Patch #1 fixes a day-one bug in the nexthop code and allows "ip nexthop
flush" to work correctly with large number of nexthops that do not fit
in a single-part dump.
Patch #2 adds a test case.
Targeting at net-next since this use case never worked, the flow is
pretty obscure and such a large number of nexthops is unlikely to be
used in any real-world scenario.
selftests: fib_nexthops: Test large scale nexthop flushing
Test that all the nexthops are flushed when a multi-part nexthop dump is
required for the flushing.
Without previous patch:
# ./fib_nexthops.sh
TEST: Large scale nexthop flushing [FAIL]
With previous patch:
# ./fib_nexthops.sh
TEST: Large scale nexthop flushing [ OK ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
nexthop: Restart nexthop dump based on last dumped nexthop identifier
Currently, a multi-part nexthop dump is restarted based on the number of
nexthops that have been dumped so far. This can result in a lot of
nexthops not being dumped when nexthops are simultaneously deleted:
# ip nexthop | wc -l
65536
# ip nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 36040 nexthops
# ip nexthop | wc -l
29496
Instead, restart the dump based on the nexthop identifier (fixed number)
of the last successfully dumped nexthop:
# ip nexthop | wc -l
65536
# ip nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 65536 nexthops
# ip nexthop | wc -l
0
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
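Why restarting a multi-part dump by count loses entries during a flush, and why restarting by identifier does not, can be shown with a small simulation. Everything here is hypothetical (table size, batch size, names); it models only the restart logic, not the netlink code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_NH 8 /* hypothetical number of nexthops */
#define BATCH  3 /* hypothetical number of entries per dump part */

struct nh_table {
	bool alive[NUM_NH]; /* alive[i] is the nexthop with id i + 1 */
};

static void table_fill(struct nh_table *t)
{
	assert(t != NULL);
	for (size_t i = 0; i < NUM_NH; i++)
		t->alive[i] = true;
}

static size_t table_count(const struct nh_table *t)
{
	size_t n = 0;

	for (size_t i = 0; i < NUM_NH; i++)
		n += t->alive[i] ? 1 : 0;
	return n;
}

/* Old behavior: each part skips as many live entries as were dumped so
 * far. The flusher deletes every dumped entry before asking for the next
 * part, so the skip now passes over entries that were never dumped. */
static size_t flush_by_count(struct nh_table *t)
{
	size_t dumped = 0;

	for (;;) {
		size_t skipped = 0, got = 0;

		for (size_t i = 0; i < NUM_NH && got < BATCH; i++) {
			if (!t->alive[i])
				continue;
			if (skipped < dumped) {
				skipped++;
				continue;
			}
			t->alive[i] = false; /* dumped, then deleted */
			got++;
		}
		if (got == 0)
			return dumped;
		dumped += got;
	}
}

/* New behavior: each part resumes after the id of the last dumped entry. */
static size_t flush_by_last_id(struct nh_table *t)
{
	size_t last_id = 0, flushed = 0;

	for (;;) {
		size_t got = 0;

		for (size_t i = last_id; i < NUM_NH && got < BATCH; i++) {
			if (!t->alive[i])
				continue;
			t->alive[i] = false; /* dumped, then deleted */
			last_id = i + 1;
			got++;
		}
		if (got == 0)
			return flushed;
		flushed += got;
	}
}
```

With 8 entries and parts of 3, the count-based variant flushes 5 and leaves 3 behind, while the id-based variant flushes all 8.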
netfilter: nftables: counter hardware offload support
This patch adds the .offload_stats operation to synchronize hardware
stats with the expression data. Update the counter expression to use
this new interface. The hardware stats are retrieved from the netlink
dump path via FLOW_CLS_STATS command to the driver.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
selftests: fib_tests: Add test cases for interaction with mangling
Test that packets are correctly routed when netfilter mangling rules are
present.
Without previous patch:
# ./fib_tests.sh -t ipv4_mangle
IPv4 mangling tests
TEST: Connection with correct parameters [ OK ]
TEST: Connection with incorrect parameters [ OK ]
TEST: Connection with correct parameters - mangling [FAIL]
TEST: Connection with correct parameters - no mangling [ OK ]
TEST: Connection check - server side [FAIL]
Tests passed: 3
Tests failed: 2
# ./fib_tests.sh -t ipv6_mangle
IPv6 mangling tests
TEST: Connection with correct parameters [ OK ]
TEST: Connection with incorrect parameters [ OK ]
TEST: Connection with correct parameters - mangling [FAIL]
TEST: Connection with correct parameters - no mangling [ OK ]
TEST: Connection check - server side [FAIL]
Tests passed: 3
Tests failed: 2
With previous patch:
# ./fib_tests.sh -t ipv4_mangle
IPv4 mangling tests
TEST: Connection with correct parameters [ OK ]
TEST: Connection with incorrect parameters [ OK ]
TEST: Connection with correct parameters - mangling [ OK ]
TEST: Connection with correct parameters - no mangling [ OK ]
TEST: Connection check - server side [ OK ]
Tests passed: 5
Tests failed: 0
# ./fib_tests.sh -t ipv6_mangle
IPv6 mangling tests
TEST: Connection with correct parameters [ OK ]
TEST: Connection with incorrect parameters [ OK ]
TEST: Connection with correct parameters - mangling [ OK ]
TEST: Connection with correct parameters - no mangling [ OK ]
TEST: Connection check - server side [ OK ]
Tests passed: 5
Tests failed: 0
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Netfilter tries to reroute mangled packets as a different route might
need to be used following the mangling. When this happens, netfilter
does not populate the IP protocol, the source port and the destination
port in the flow key. Therefore, FIB rules that match on these fields
are ignored and packets can be misrouted.
Solve this by dissecting the outer flow and populating the flow key
before rerouting the packet. Note that flow dissection only happens when
FIB rules that match on these fields are installed, so in the common
case there should not be a penalty.
Reported-by: Michal Soltys <msoltyspl@yandex.pl>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nftables_offload: special ethertype handling for VLAN
The nftables offload parser sets FLOW_DISSECTOR_KEY_BASIC .n_proto to the
ethertype field in the Ethernet frame. However:
- FLOW_DISSECTOR_KEY_BASIC .n_proto field always stores either IPv4 or IPv6
ethertypes.
- FLOW_DISSECTOR_KEY_VLAN .vlan_tpid stores either the 802.1q or 802.1ad
ethertypes. Same as for FLOW_DISSECTOR_KEY_CVLAN.
This function adjusts the flow dissector to handle two scenarios:
1) FLOW_DISSECTOR_KEY_VLAN .vlan_tpid is set to 802.1q or 802.1ad.
Then, transfer:
- the .n_proto field to FLOW_DISSECTOR_KEY_VLAN .tpid.
- the original FLOW_DISSECTOR_KEY_VLAN .tpid to the
FLOW_DISSECTOR_KEY_CVLAN .tpid
- the original FLOW_DISSECTOR_KEY_CVLAN .tpid to the .n_proto field.
2) .n_proto is set to 802.1q or 802.1ad. Then, transfer:
- the .n_proto field to FLOW_DISSECTOR_KEY_VLAN .tpid.
- the original FLOW_DISSECTOR_KEY_VLAN .tpid to the .n_proto field.
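The two transfer scenarios above can be sketched on a simplified structure. The struct, the function name and the use of host byte order are assumptions for illustration; the real code operates on flow dissector keys and masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP     0x0800
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Hypothetical, simplified view of the three ethertype slots. */
struct ethertypes {
	uint16_t n_proto;    /* stands for FLOW_DISSECTOR_KEY_BASIC .n_proto */
	uint16_t vlan_tpid;  /* stands for FLOW_DISSECTOR_KEY_VLAN tpid */
	uint16_t cvlan_tpid; /* stands for FLOW_DISSECTOR_KEY_CVLAN tpid */
};

static int is_vlan_ethertype(uint16_t type)
{
	return type == ETH_P_8021Q || type == ETH_P_8021AD;
}

static void transfer_vlan(struct ethertypes *e)
{
	uint16_t outer;

	assert(e != NULL);
	outer = e->n_proto;

	if (is_vlan_ethertype(e->vlan_tpid)) {
		/* Scenario 1: rotate the three slots. */
		e->n_proto = e->cvlan_tpid;
		e->cvlan_tpid = e->vlan_tpid;
		e->vlan_tpid = outer;
	} else if (is_vlan_ethertype(outer)) {
		/* Scenario 2: swap n_proto and the VLAN tpid. */
		e->n_proto = e->vlan_tpid;
		e->vlan_tpid = outer;
	}
}
```

For a double-tagged IPv4 frame parsed as n_proto 0x88A8, VLAN tpid 0x8100 and CVLAN tpid 0x0800, the rotation leaves n_proto holding 0x0800 and each VLAN slot holding its own tag type.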
Add CFO tracking, which stands for central frequency offset tracking, to
adjust the oscillator so that it aligns with the central frequency of the
connected AP. This can yield better performance.
Johannes Berg [Thu, 15 Apr 2021 13:48:46 +0000 (16:48 +0300)]
iwlwifi: pcie: don't enable BHs with IRQs disabled
After the fix from Jiri that disabled local IRQs instead of
just BHs (necessary to fix an issue with submitting a command
with IRQs already disabled), there was still a situation in
which we could, deep in there, enable BHs if the device config
sets the apmg_wake_up_wa configuration, which is true on all
7000 series devices.
To fix that, but not require reverting commit 1ed08f6fb5ae
("iwlwifi: remove flags argument for nic_access"), split up
nic access into a version with BH manipulation to use most
of the time, and without it for this specific case where the
local IRQs are already disabled.
In mwl8k_probe_hw, hw->priv->txq is freed the first time by
dma_free_coherent() in the call chain:
if(!priv->ap_fw)->mwl8k_init_txqs(hw)->mwl8k_txq_init(hw, i).
Then in err_free_queues of mwl8k_probe_hw, hw->priv->txq is freed
a second time by mwl8k_txq_deinit(hw, i)->dma_free_coherent().
This patch sets txq->txd to NULL after the first free to avoid the
double free.
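The pattern behind this fix, sketched in userspace with malloc()/free() standing in for the DMA allocation calls. The struct, the function names and the frees counter are hypothetical and exist only to make the behavior observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct txq {
	void *txd; /* hypothetical descriptor ring */
};

static int frees; /* counts real frees, for demonstration only */

static void txq_release(struct txq *q)
{
	assert(q != NULL);
	if (q->txd == NULL)
		return;    /* already released, nothing to do */
	free(q->txd);
	q->txd = NULL;     /* clear so a later deinit cannot free again */
	frees++;
}

/* Init that allocates the ring, then fails and cleans up after itself. */
static int txq_init_failing(struct txq *q)
{
	assert(q != NULL);
	q->txd = malloc(64);
	if (q->txd == NULL)
		return -1;
	/* pretend a later allocation failed */
	txq_release(q);
	return -1;
}

/* Error path of the caller: runs deinit on every queue, including the
 * one whose init already cleaned up. */
static void txq_deinit(struct txq *q)
{
	txq_release(q);
}
```

Because the pointer is cleared at the first release, the second call through txq_deinit() becomes a no-op instead of a double free.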
The problem arises because the value of group is 5 for channel 14. The trivial
increase in the dimension of bw40_base fails as this struct must match the layout of
efuse. The fix is to add the rate as an argument to rtw_get_channel_group() and set
the group for channel 14 to 4 if rate <= DESC_RATE11M.
This patch fixes commit fa6dfe6bff24 ("rtw88: resolve order of tx power setting routines")
Fixes: fa6dfe6bff24 ("rtw88: resolve order of tx power setting routines")
Reported-by: Богдан Пилипенко <bogdan.pylypenko107@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210401192717.28927-1-Larry.Finger@lwfinger.net
Marek Vasut [Sat, 27 Mar 2021 23:59:32 +0000 (00:59 +0100)]
rsi: Use resume_noirq for SDIO
rsi_resume() accesses the bus to enable interrupts on the RSI
SDIO WiFi card. However, when calling sdio_claim_host() in the resume
path, it is possible the bus is already claimed and sdio_claim_host()
spins indefinitely. Enable the SDIO card interrupts in resume_noirq
instead to prevent anything else from claiming the SDIO bus first.
Fixes: 20db07332736 ("rsi: sdio suspend and resume support")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Amitkumar Karwar <amit.karwar@redpinesignals.com>
Cc: Angus Ainslie <angus@akkea.ca>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Karun Eagalapati <karun256@gmail.com>
Cc: Martin Kepplinger <martink@posteo.de>
Cc: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Cc: Siva Rebbagondla <siva8118@gmail.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210327235932.175896-1-marex@denx.de
The opening comment mark '/**' is used for highlighting the beginning of
kernel-doc comments.
There are some files in drivers/net/wireless/rsi which follow this syntax
in their file headers, i.e. start with '/**' like comments, which causes
unexpected warnings from kernel-doc.
E.g., running scripts/kernel-doc -none on drivers/net/wireless/rsi/rsi_coex.h
causes this warning:
"warning: wrong kernel-doc identifier on line:
* Copyright (c) 2018 Redpine Signals Inc."
Similarly for other files too.
Provide a simple fix by replacing such occurrences with general comment
format, i.e., "/*", to prevent kernel-doc from parsing it.
r8169: keep pause settings on interface down/up cycle
Currently, if the user changes the pause settings, the default settings
will be restored after an interface down/up cycle, and also when
resuming from suspend. This doesn't seem to provide the best user
experience. Change this to keep user settings, and just ensure that in
jumbo mode pause is disabled.
Small drawback: When switching back mtu from jumbo to non-jumbo then
pause remains disabled (but user can enable it using ethtool).
I think that's a not too common scenario and acceptable.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
- keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
- fix build after move to net_generic
Arnd Bergmann [Tue, 23 Mar 2021 13:16:28 +0000 (14:16 +0100)]
airo: work around stack usage warning
gcc-11 with KASAN on 32-bit arm produces a warning about a function
that needs a lot of stack space:
drivers/net/wireless/cisco/airo.c: In function 'setup_card.constprop':
drivers/net/wireless/cisco/airo.c:3960:1: error: the frame size of 1512 bytes is larger than 1400 bytes [-Werror=frame-larger-than=]
Most of this is from a single large structure that could be dynamically
allocated or moved into the per-device structure. However, as the callers
all seem to have a fairly well bounded call chain, the easiest change
is to pull out the part of the function that needs the large variables
into a separate function and mark that as noinline_for_stack. This does
not reduce the total stack usage, but it gets rid of the warning and
requires minimal changes otherwise.
Arnd Bergmann [Tue, 23 Mar 2021 12:57:14 +0000 (13:57 +0100)]
wlcore: fix overlapping snprintf arguments in debugfs
gcc complains about undefined behavior in calling snprintf()
with the same buffer as input and output:
drivers/net/wireless/ti/wl18xx/debugfs.c: In function 'diversity_num_of_packets_per_ant_read':
drivers/net/wireless/ti/wl18xx/../wlcore/debugfs.h:86:3: error: 'snprintf' argument 4 overlaps destination object 'buf' [-Werror=restrict]
86 | snprintf(buf, sizeof(buf), "%s[%d] = %d\n", \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
87 | buf, i, stats->sub.name[i]); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/ti/wl18xx/debugfs.c:24:2: note: in expansion of macro 'DEBUGFS_FWSTATS_FILE_ARRAY'
24 | DEBUGFS_FWSTATS_FILE_ARRAY(a, b, c, wl18xx_acx_statistics)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireless/ti/wl18xx/debugfs.c:159:1: note: in expansion of macro 'WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY'
159 | WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY(diversity, num_of_packets_per_ant,
There are probably other ways of handling the debugfs file, without
using on-stack buffers, but a simple workaround here is to remember the
current position in the buffer and just keep printing in there.
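A rough userspace sketch of the "remember the current position" workaround. The function name and signature are hypothetical. Plain snprintf() returns the untruncated length, so the sketch clamps it by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Append "name[i] = value" lines into buf without ever passing buf as a
 * source argument. Returns the number of characters stored, excluding
 * the terminating NUL. */
static size_t format_array(char *buf, size_t size, const char *name,
			   const int *vals, size_t n)
{
	size_t pos = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < n && pos < size - 1; i++) {
		int ret = snprintf(buf + pos, size - pos, "%s[%zu] = %d\n",
				   name, i, vals[i]);

		if (ret < 0)
			break;
		/* On truncation only size - pos - 1 characters were stored. */
		if ((size_t)ret < size - pos)
			pos += (size_t)ret;
		else
			pos = size - 1;
	}
	return pos;
}
```

Each call writes at buf + pos, so the destination and the format arguments never overlap.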
Arnd Bergmann [Mon, 22 Mar 2021 10:43:34 +0000 (11:43 +0100)]
libertas: avoid -Wempty-body warning
Building without mesh support shows a couple of warnings with
'make W=1':
drivers/net/wireless/marvell/libertas/main.c: In function 'lbs_start_card':
drivers/net/wireless/marvell/libertas/main.c:1068:37: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
1068 | lbs_start_mesh(priv);
Change the macros to use the usual "do { } while (0)" instead to shut up
the warnings and make the code a little more robust.
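A small sketch of the macro pattern mentioned above. The macro, struct and function names are hypothetical, not the libertas ones.

```c
#include <assert.h>
#include <stddef.h>

struct card {
	int started; /* hypothetical state, set once start-up is done */
};

/* Stub for a build without the optional feature. An empty expansion
 * would turn the call site into "if (x) ;", which -Wempty-body flags.
 * "do { } while (0)" is one complete statement that still needs its ';'. */
#define start_mesh_stub(card) do { } while (0)

static void start_card(struct card *c, int want_mesh)
{
	assert(c != NULL);

	if (want_mesh)
		start_mesh_stub(c);

	c->started = 1;
}
```

The call site reads the same whether the macro expands to a real call or to the stub, which is the robustness the commit message refers to.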
The 'c2hcmd_lock' spinlock is only used to protect some __skb_queue_tail()
and __skb_dequeue() calls.
Use the lock provided in the skb itself and call skb_queue_tail() and
skb_dequeue(). These functions already include the correct locking.
Dan Carpenter [Fri, 19 Mar 2021 14:47:31 +0000 (17:47 +0300)]
wilc1000: fix a loop timeout condition
If the loop fails, the "while(trials--) {" loop will exit with "trials"
set to -1. The test for that expects it to end with "trials" set to 0
so the warning message will not be printed.
Fix this by changing from a post-op to a pre-op. This does mean that
we only make 99 attempts instead of 100 but that's okay.
Fixes: f135a1571a05 ("wilc1000: Support chip sleep over SPI")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/YFS5gx/gi70zlIaO@mwanda
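The difference between the post-op and pre-op loop conditions in the wilc1000 timeout fix above, as a standalone sketch. The chip_ready() stub is hypothetical and always fails so that the timeout path is exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical poll that never succeeds. */
static bool chip_ready(void)
{
	return false;
}

/* Post-decrement: on timeout the loop exits with trials == -1, so a
 * following "if (trials == 0)" check never fires. */
static int poll_post_decrement(int trials)
{
	assert(trials > 0);
	while (trials--) {
		if (chip_ready())
			break;
	}
	return trials;
}

/* Pre-decrement: on timeout the loop exits with trials == 0, at the
 * cost of one fewer attempt. */
static int poll_pre_decrement(int trials)
{
	assert(trials > 0);
	while (--trials) {
		if (chip_ready())
			break;
	}
	return trials;
}
```

Starting from 100, the first variant ends at -1 and the second at 0, which is the value the warning check expects.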
In preparation to enable -Wimplicit-fallthrough for Clang, fix
multiple warnings by replacing /* fall through */ comments with
the new pseudo-keyword macro fallthrough; instead of letting the
code fall through to the next case.
Notice that Clang doesn't recognize /* fall through */ comments as
implicit fall-through markings.
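A userspace sketch of the fallthrough pseudo-keyword. The macro below is a stand-in built on the compiler's fallthrough attribute, with a no-op fallback for compilers that lack it. The function and flag values are hypothetical.

```c
#include <assert.h>

/* Stand-in for the fallthrough pseudo-keyword. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do { } while (0) /* fallthrough */
#endif

/* Each level also enables everything the lower levels enable. */
static int flags_for_level(int level)
{
	int flags = 0;

	assert(level >= 0);

	switch (level) {
	case 2:
		flags |= 0x4;
		fallthrough;
	case 1:
		flags |= 0x2;
		fallthrough;
	case 0:
		flags |= 0x1;
		break;
	default:
		break;
	}
	return flags;
}
```

Unlike a comment, the statement form is visible to the compiler, so both GCC and Clang can tell intended fall-through from a forgotten break.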
wilc1000: Bring MAC address setting in line with typical Linux behavior
Linux network drivers normally disallow changing the MAC address when
the interface is up. This driver has been different in that it allows
to change the MAC address *only* when it's up. This patch brings
wilc1000 behavior more in line with other network drivers. We could
have replaced wilc_set_mac_addr() with eth_mac_addr() but that would
break existing documentation on how to change the MAC address.
Likewise, return -EADDRNOTAVAIL (not -EINVAL) when the specified MAC
address is invalid or unavailable.
The driver so far has always disabled CRC protection. This means any
data corruption that occurs during the SPI transfers could go
undetected. This patch adds module parameters enable_crc7 and
enable_crc16 to selectively turn on CRC7 (for command transfers) and
CRC16 (for data transfers), respectively.
The default configuration remains unchanged, with both CRC7 and CRC16
off.
The performance impact of CRC was measured by running ttcp -t four
times in a row on a SAMA5 device:
CRC7  CRC16  Throughput (KB/s)  Standard deviation (KB/s)
----  -----  -----------------  -------------------------
off   off    1720               +/- 48
on    off    1658               +/- 58
on    on     1579               +/- 84
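For readers unfamiliar with command CRCs, here is a bitwise sketch of a CRC-7 with polynomial x^7 + x^3 + 1 and a zero seed, the variant used for SD/MMC commands. Whether the driver uses the same seed and bit alignment is not stated above, so treat this as a generic illustration rather than the driver's routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-7, polynomial x^7 + x^3 + 1, seed 0 (assumed), MSB first.
 * The result is returned in the low 7 bits. */
static uint8_t crc7(const uint8_t *data, size_t len)
{
	uint8_t crc = 0;

	assert(data != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			uint8_t in = (data[i] >> bit) & 1;
			uint8_t top = (crc >> 6) & 1;

			crc = (uint8_t)((crc << 1) & 0x7f);
			if (in ^ top)
				crc ^= 0x09; /* x^3 + 1 */
		}
	}
	return crc;
}
```

As a sanity check, the five bytes 40 00 00 00 00 (the SD CMD0 frame) give 0x4a, which corresponds to the familiar trailing byte 0x95 once shifted left and combined with the end bit.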
wilc1000: Introduce symbolic names for SPI protocol register
The WILC1000 protocol control register has bits for enabling the CRCs
(CRC7 for commands and CRC16 for data) and to set the data packet
size. Define symbolic names for those so the code is more easily
understood.
For CMD_SINGLE_READ and CMD_INTERNAL_READ, WILC may insert one or more
zero bytes between the command response and the DATA Start tag (0xf3).
This behavior appears to be undocumented in "ATWILC1000 USER GUIDE"
(https://tinyurl.com/4hhshdts) but we have observed 1-4 zero bytes
when the SPI bus operates at 48MHz and none when it operates at 1MHz.
This code is derived from the equivalent code of the wilc driver in
the linux-at91 repository.
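A minimal sketch of tolerating the zero padding described above. Only the tag value 0xf3 comes from the text; the function name, the buffer-based interface and the -1 error value are assumptions, and the real driver reads bytes from the SPI bus rather than from a prefilled buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_START_TAG 0xf3 /* DATA Start tag, from the text */

/* Return the index of the DATA Start tag, skipping zero padding.
 * Returns -1 if another non-zero byte shows up first or if the tag is
 * not found within len bytes. */
static int find_data_start(const uint8_t *rsp, size_t len)
{
	assert(rsp != NULL);

	for (size_t i = 0; i < len; i++) {
		if (rsp[i] == DATA_START_TAG)
			return (int)i;
		if (rsp[i] != 0x00)
			return -1;
	}
	return -1;
}
```

Bounding the search by len keeps a missing tag from turning into an endless wait.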
Brian Norris [Thu, 25 Feb 2021 02:44:54 +0000 (18:44 -0800)]
mwifiex: don't print SSID to logs
There are a few reasons not to dump SSIDs as-is in kernel logs:
1) they're not guaranteed to be any particular text encoding (UTF-8,
ASCII, ...) in general
2) it's somewhat redundant; the BSSID should be enough to uniquely
identify the AP/STA to which we're connecting
3) BSSIDs have an easily-recognized format, whereas SSIDs do not (they
are free-form)
4) other common drivers (e.g., everything based on mac80211) get along
just fine by only including BSSIDs when logging state transitions
Additional notes on reason #3: this is important for the
privacy-conscious, especially when providing tools that convey
kernel logs on behalf of a user -- e.g., when reporting bugs. So for
example, it's easy to automatically filter logs for MAC addresses, but
it's much harder to filter SSIDs out of unstructured text.
Dan Carpenter [Wed, 14 Apr 2021 08:29:55 +0000 (11:29 +0300)]
ipw2x00: potential buffer overflow in libipw_wx_set_encodeext()
The "ext->key_len" is a u16 that comes from the user. If it's over
SCM_KEY_LEN (32) that could lead to memory corruption.
Fixes: e0d369d1d969 ("[PATCH] ieee82011: Added WE-18 support to default wireless extension handler")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stanislav Yakovlev <stas.yakovlev@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/YHaoA1i+8uT4ir4h@mwanda
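One way to guard the key copy described in the ipw2x00 fix above, as a userspace sketch. SCM_KEY_LEN and its value come from the text; the struct, the function name and the choice to reject with -EINVAL are assumptions. The actual patch may bound the length differently, for example by clamping it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCM_KEY_LEN 32 /* from the text */

struct key_slot {
	uint8_t key[SCM_KEY_LEN];
	uint16_t key_len;
};

/* Copy a caller-supplied key only if it fits the fixed-size slot. */
static int set_key(struct key_slot *slot, const uint8_t *key,
		   uint16_t key_len)
{
	assert(slot != NULL && key != NULL);

	if (key_len > SCM_KEY_LEN)
		return -EINVAL; /* would overflow slot->key */

	memcpy(slot->key, key, key_len);
	slot->key_len = key_len;
	return 0;
}
```

The check runs before the copy, so an oversized length from the user never reaches memcpy().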
The 'c2hcmd_lock' spinlock is only used to protect some __skb_queue_tail()
and __skb_dequeue() calls.
Use the lock provided in the skb itself and call skb_queue_tail() and
skb_dequeue(). These functions already include the correct locking.
Yang Li [Wed, 31 Mar 2021 09:13:43 +0000 (17:13 +0800)]
rtlwifi: rtl8188ee: remove redundant assignment of variable rtlpriv->btcoexist.reg_bt_sco
Assigning value "3" to "rtlpriv->btcoexist.reg_bt_sco" here, but that
stored value is overwritten before it can be used.
Coverity reports this problem as
CWE563: A value assigned to a variable is never used.
drivers/net/wireless/realtek/rtlwifi/rtl8188ee/hw.c:
rtl8188ee_bt_reg_init
Colin Ian King [Sat, 27 Mar 2021 23:00:14 +0000 (23:00 +0000)]
rtlwifi: remove redundant assignment to variable err
Variable err is assigned -ENODEV followed by an error return path
via label error_out that does not access the variable and returns
with the -ENODEV error return code. The assignment to err is
redundant and can be removed.
Ping-Ke Shih [Fri, 19 Feb 2021 05:26:07 +0000 (13:26 +0800)]
rtlwifi: 8821ae: upgrade PHY and RF parameters
The signal strength of 5G is quite low, so users can't connect to an AP far
away. New parameters with a new format and its parser were added by commit
84d26fda52e2 ("rtlwifi: Update 8821ae new phy parameters and its parser."),
but some parameters are missing. Use this commit to update to the new
parameters that use the new format.
Fixes: 84d26fda52e2 ("rtlwifi: Update 8821ae new phy parameters and its parser") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210219052607.7323-1-pkshih@realtek.com
Merge tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.12-rc8, including fixes from netfilter, and
bpf. BPF verifier changes stand out, otherwise things have slowed
down.
Current release - regressions:
- gro: ensure frag0 meets IP header alignment
- Revert "net: stmmac: re-init rx buffers when mac resume back"
- ethernet: macb: fix the restore of cmp registers
Previous releases - regressions:
- ixgbe: Fix NULL pointer dereference in ethtool loopback test
- ixgbe: fix unbalanced device enable/disable in suspend/resume
- phy: marvell: fix detection of PHY on Topaz switches
- make tcp_allowed_congestion_control readonly in non-init netns
- xen-netback: Check for hotplug-status existence before watching
Previous releases - always broken:
- bpf: mitigate a speculative oob read of up to map value size by
tightening the masking window
- sctp: fix race condition in sctp_destroy_sock
- sit, ip6_tunnel: Unregister catch-all devices
- netfilter: nftables: clone set element expression template
- net: geneve: check skb is large enough for IPv4/IPv6 header
- netlink: don't call ->netlink_bind with table lock held"
* tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
netlink: don't call ->netlink_bind with table lock held
MAINTAINERS: update my email
bpf: Update selftests to reflect new error states
bpf: Tighten speculative pointer arithmetic mask
bpf: Move sanitize_val_alu out of op switch
bpf: Refactor and streamline bounds check into helper
bpf: Improve verifier error messages for users
bpf: Rework ptr_limit into alu_limit and add common error path
bpf: Ensure off_reg has no mixed signed bounds for all types
bpf: Move off_reg into sanitize_ptr_alu
bpf: Use correct permission flag for mixed signed bounds arithmetic
ch_ktls: do not send snd_una update to TCB in middle
ch_ktls: tcb close causes tls connection failure
ch_ktls: fix device connection close
ch_ktls: Fix kernel panic
i40e: fix the panic when running bpf in xdpdrv mode
net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
net/mlx5e: Fix setting of RS FEC mode
net/mlx5: Fix setting of devlink traps in switchdev mode
Revert "net: stmmac: re-init rx buffers when mac resume back"
...
Merge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"The largest change is for a regression that landed during -rc1 for
block-device read-only handling. Vaibhav found a new use for the
ability (originally introduced by virtio_pmem) to call back to the
platform to flush data, but also found an original bug in that
implementation. Lastly, Arnd cleans up some compile warnings in dax.
This has all appeared in -next with no reported issues.
Summary:
- Fix a regression of read-only handling in the pmem driver
- Fix a compile warning
- Fix support for platform cache flush commands on powerpc/papr"
* tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC
libnvdimm: Notify disk drivers to revalidate region read-only
dax: avoid -Wempty-body warnings
Merge tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL memory class fixes from Dan Williams:
"A collection of fixes for the CXL memory class driver introduced in
this release cycle.
The driver was primarily developed on a work-in-progress QEMU
emulation of the interface and we have since found a couple places
where it hid spec compliance bugs in the driver, or had a spec
implementation bug itself.
The biggest change here is replacing a percpu_ref with an rwsem to
clean up a couple of bugs in the error unwind path during ioctl device
init. Lastly there were some minor cleanups to not export the
power-management sysfs-ABI for the ioctl device, use the proper sysfs
helper for emitting values, and prevent subtle bugs as new
administration commands are added to the supported list.
The bulk of it has appeared in -next save for the top commit which was
found today and validated on a fixed-up QEMU model.
Summary:
- Fix support for CXL memory devices with registers offset from the
BAR base.
- Fix the reporting of device capacity.
- Fix the driver commands list definition to be disconnected from the
UAPI command list.
- Replace percpu_ref with rwsem to fix initialization error path.
- Fix leaks in the driver initialization error path.
- Drop the power/ directory from CXL device sysfs.
- Use the recommended sysfs helper for attribute 'show'
implementations"
* tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/mem: Fix memory device capacity probing
cxl/mem: Fix register block offset calculation
cxl/mem: Force array size of mem_commands[] to CXL_MEM_COMMAND_ID_MAX
cxl/mem: Disable cxl device power management
cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
cxl/mem: Use sysfs_emit() for attribute show routines
Kalle Valo [Sat, 17 Apr 2021 08:38:01 +0000 (11:38 +0300)]
Merge tag 'iwlwifi-next-for-kalle-2021-04-12-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
iwlwifi patches for v5.13
* Add support for new FTM FW APIs;
* Some CSA fixes;
* Support for new HW family and other HW detection fixes;
* Robustness improvement in the HW detection code;
* One fix in PMF;
* Some new regulatory features;
* Support for passive scan in 6GHz;
* Some improvements in the sync queue implementation;
* Support for new devices;
* Support for a new FW API command version;
* Some locking fixes;
* Bump the FW API version support for AX devices;
* Some other small fixes, clean-ups and improvements.
# gpg: Signature made Wed 14 Apr 2021 12:33:29 PM EEST using RSA key ID 1A3CC5FA
# gpg: Good signature from "Luciano Roth Coelho (Luca) <luca@coelho.fi>"
# gpg: aka "Luciano Roth Coelho (Intel) <luciano.coelho@intel.com>"
Kalle Valo [Sat, 17 Apr 2021 08:34:43 +0000 (11:34 +0300)]
Merge tag 'mt76-for-kvalo-2021-04-12' of https://github.com/nbd168/wireless
mt76 patches for 5.13
* code cleanup
* mt7915/mt7615 decap offload support
* driver fixes
* mt7613 eeprom support
* MCU code unification
* threaded NAPI support
* new device IDs
* mt7921 device reset support
* rx timestamp support
# gpg: Signature made Tue 13 Apr 2021 12:11:25 AM EEST using DSA key ID 02A76EF5
# gpg: Good signature from "Felix Fietkau <nbd@nbd.name>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 75D1 1A7D 91A7 710F 4900 42EF D77D 141D 02A7 6EF5
Dan Williams [Sat, 17 Apr 2021 00:43:30 +0000 (17:43 -0700)]
cxl/mem: Fix memory device capacity probing
The CXL Identify Memory Device output payload emits capacity in 256MB
units. The driver is treating the capacity field as bytes. This was
missed because QEMU reports bytes when it should report bytes / 256MB.
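The unit conversion described above can be sketched as follows. This is an illustration only: the macro and function names are hypothetical, and it assumes "256MB" means 256 * 2^20 bytes.

```c
#include <stdint.h>

/* Assumed unit size: one count in the capacity field equals 256MB. */
#define DEMO_CXL_CAPACITY_MULTIPLIER (256ULL * 1024 * 1024)

/* Convert a raw capacity field (in 256MB units) to bytes. */
static uint64_t demo_capacity_to_bytes(uint64_t raw_units)
{
	return raw_units * DEMO_CXL_CAPACITY_MULTIPLIER;
}
```

Under that assumption, a raw field value of 4 corresponds to 1GB, whereas treating the field as bytes would report a 4-byte device.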
This patch used the macro helper mptcp_for_each_subflow() instead of
list_for_each_entry() in mptcp_close.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added a tracepoint in subflow_check_data_avail() to show the
mapping status.
Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added a tracepoint in ack_update_msk() to track the
incoming data_ack and window/snd_una updates.
Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added a tracepoint in the mapping status function
get_mapping_status() to dump every mpext field.
Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added a tracepoint in the packet scheduler function
mptcp_subflow_get_send().
Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moved the static function mptcp_subflow_active to protocol.h
as an inline one.
Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Drop 'S' from end of CONFIG_MPTCP_KUNIT_TESTS in order to adhere to the
KUNIT *_KUNIT_TEST config name format.
Fixes: a00a582203db (mptcp: move crypto test to KUNIT) Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Nico Pache <npache@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Apr 2021 00:08:40 +0000 (17:08 -0700)]
Merge branch 'enetc-xdp-fixes'
Vladimir Oltean says:
====================
Fixups for XDP on NXP ENETC
After some more XDP testing on the NXP LS1028A, this is a set of 10 bug
fixes, simplifications and tweaks, ranging from addressing Toke's feedback
(the network stack can run concurrently with XDP on the same TX rings)
to fixing some OOM conditions seen under TX congestion.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 16 Apr 2021 21:22:25 +0000 (00:22 +0300)]
net: enetc: apply the MDIO workaround for XDP_REDIRECT too
Described in fd5736bf9f23 ("enetc: Workaround for MDIO register access
issue") is a workaround for a hardware bug that requires a register
access of the MDIO controller to never happen concurrently with a
register access of a port PF. To avoid that, a mutual exclusion scheme
with rwlocks was implemented - the port PF accessors are the 'read'
side, and the MDIO accessors are the 'write' side.
When we do XDP_REDIRECT between two ENETC interfaces, all is fine
because the MDIO lock is already taken from the NAPI poll loop.
But when the ingress interface is not ENETC, just the egress is, the
MDIO lock is not taken, so we might access the port PF registers
concurrently with MDIO, which will make the link flap due to wrong
values returned from the PHY.
To avoid this, let's just slap an enetc_lock_mdio/enetc_unlock_mdio at
the beginning and ending of enetc_xdp_xmit. The fact that the MDIO lock
is designed as a rwlock is important here, because the read side is
reentrant (that is one of the main reasons why we chose it). Usually,
the way we benefit from its reentrancy is by running the data path
concurrently on both CPUs, but in this case, we benefit from the
reentrancy by taking the lock even when the lock is already taken
(and that's the situation where ENETC is both the ingress and the egress
interface for XDP_REDIRECT, which was fine before and still is fine now).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 16 Apr 2021 21:22:24 +0000 (00:22 +0300)]
net: enetc: fix buffer leaks with XDP_TX enqueue rejections
If the TX ring is congested, enetc_xdp_tx() returns false for the
current XDP frame (represented as an array of software BDs).
This array of software TX BDs is constructed in enetc_rx_swbd_to_xdp_tx_swbd
from software BDs freshly cleaned from the RX ring. The issue is that we
scrub the RX software BDs too soon, more precisely before we know that
we can enqueue the TX BDs successfully into the TX ring.
If we can't enqueue them (and enetc_xdp_tx returns false), we call
enetc_xdp_drop which attempts to recycle the buffers held by the RX
software BDs. But because we scrubbed those RX BDs already, two things
happen:
(a) we leak their memory
(b) we populate the RX software BD ring with an all-zero rx_swbd
structure, which makes the buffer refill path allocate more memory.
enetc_refill_rx_ring
-> if (unlikely(!rx_swbd->page))
-> enetc_new_page
That is a recipe for fast OOM.
Fixes: 7ed2bc80074e ("net: enetc: add support for XDP_TX") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
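The ordering problem described above can be modelled in a few lines of user-space C. This is a hedged sketch, not the enetc code: every demo_* name is hypothetical, and the "ring" is reduced to a free-slot counter. It shows the safe ordering, where the RX software BDs are cleared only after the TX enqueue is known to have succeeded, so that a rejected frame can still recycle its buffers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the driver structures. */
struct demo_rx_swbd { void *page; };
struct demo_tx_ring { int free_slots; };

/* Try to reserve n TX slots; fails when the ring is congested. */
static bool demo_tx_enqueue(struct demo_tx_ring *tx, int n)
{
	if (tx->free_slots < n)
		return false;
	tx->free_slots -= n;
	return true;
}

/*
 * Hand an n-buffer frame to TX. The RX entries are scrubbed only once
 * the enqueue succeeded; on failure they still own their pages, so the
 * caller can recycle them instead of leaking or reallocating.
 */
static bool demo_xdp_tx(struct demo_tx_ring *tx,
			struct demo_rx_swbd *rx, int n)
{
	int i;

	if (!demo_tx_enqueue(tx, n))
		return false;

	for (i = 0; i < n; i++)
		rx[i].page = NULL;	/* ownership moved to the TX ring */
	return true;
}
```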
Vladimir Oltean [Fri, 16 Apr 2021 21:22:22 +0000 (00:22 +0300)]
net: enetc: use dedicated TX rings for XDP
It is possible for one CPU to perform TX hashing (see netdev_pick_tx)
between the 8 ENETC TX rings, and the TX hashing to select TX queue 1.
At the same time, it is possible for the other CPU to already use TX
ring 1 for XDP (either XDP_TX or XDP_REDIRECT). Since there is no mutual
exclusion between XDP and the network stack, we run into an issue
because the ENETC TX procedure is not reentrant.
The obvious approach would be to just make XDP take the lock of the
network stack's TX queue corresponding to the ring it's about to enqueue
in.
For XDP_REDIRECT, this is quite straightforward, a lock at the beginning
and end of enetc_xdp_xmit() should do the trick.
But for XDP_TX, it's a bit more complicated. For one, we do TX batching
all by ourselves for frames with the XDP_TX verdict. This is something
we would like to keep the way it is, for performance reasons. But
batching means that the network stack's lock should be kept from the
first enqueued XDP_TX frame until we ring the doorbell. That is
mostly fine, except for cases when in the same NAPI loop we have mixed
XDP_TX and XDP_REDIRECT frames. So if enetc_xdp_xmit() gets called while
we are holding the lock from the RX NAPI, then bam, deadlock. The naive
answer could be 'just flush the XDP_TX frames first, then release the
network stack's TX queue lock, then call xdp_do_flush_map()'. But even
xdp_do_redirect() is capable of flushing the batched XDP_REDIRECT
frames, so unless we unlock/relock the TX queue around xdp_do_redirect(),
there simply isn't any clean way to protect XDP_TX from concurrent
network stack .ndo_start_xmit() on another CPU.
So we need to take a different approach, and that is to reserve two
rings for the sole use of XDP. We leave TX rings
0..ndev->real_num_tx_queues-1 to be handled by the network stack, and we
pick the XDP rings from the end of the priv->tx_ring array.
We make an effort to keep the mapping done by enetc_alloc_msix() which
decides which CPU handles the TX completions of which TX ring in its
NAPI poll. So the XDP TX ring of CPU 0 is handled by TX ring 6, and the
XDP TX ring of CPU 1 is handled by TX ring 7.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
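The ring numbering in the message above (8 TX rings, 2 reserved for XDP, CPU 0 on ring 6 and CPU 1 on ring 7) follows from simple index arithmetic. The sketch below is illustrative only; the helper names are hypothetical and do not come from the driver.

```c
#include <assert.h>

/* Number of TX rings left to the network stack (rings 0..result-1). */
static int demo_stack_tx_rings(int num_tx_rings, int num_xdp_rings)
{
	assert(num_xdp_rings >= 0 && num_xdp_rings <= num_tx_rings);
	return num_tx_rings - num_xdp_rings;
}

/* Index of the XDP TX ring reserved for a given CPU, taken from the
 * end of the ring array. */
static int demo_xdp_tx_ring_index(int num_tx_rings, int num_xdp_rings,
				  int cpu)
{
	assert(cpu >= 0 && cpu < num_xdp_rings);
	return demo_stack_tx_rings(num_tx_rings, num_xdp_rings) + cpu;
}
```

With 8 rings and 2 XDP rings, the stack keeps rings 0..5, and the helper yields 6 for CPU 0 and 7 for CPU 1, matching the mapping stated in the commit message.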
Vladimir Oltean [Fri, 16 Apr 2021 21:22:21 +0000 (00:22 +0300)]
net: enetc: increase TX ring size
Now that commit d6a2829e82cf ("net: enetc: increase RX ring default
size") has increased the RX ring size, it is quite easy to congest the
TX rings when the traffic is predominantly XDP_TX, as the RX ring is
quite a bit larger than the TX one.
Since we bit the bullet and did the expensive thing already (larger RX
rings consume more memory pages), it seems quite foolish to keep the TX
rings small. So make them equally sized with RX.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
enetc_clean_rx_ring_xdp will 'break', but that 'break' instruction isn't
strong enough to actually break the NAPI poll loop, just the switch/case
statement for XDP actions. So we increment rx_frm_cnt and go to the next
frames minding our own business.
Instead let's do what the skb NAPI poll function does, and break the
loop now, waiting for the memory pressure to go away. Otherwise the next
calls to build_skb() are likely to fail too.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
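The C scoping rule behind this fix is that a 'break' inside a switch statement leaves only the switch, not the enclosing loop. The following sketch models that with hypothetical names (demo_poll, alloc_ok, stop_on_failure); it is not the driver's poll function, and the allocation failure is simulated by a boolean array.

```c
#include <stdbool.h>

enum demo_action { DEMO_PASS, DEMO_DROP };

/*
 * Walk up to `budget` frames. alloc_ok[i] simulates whether the buffer
 * allocation for frame i succeeds. With stop_on_failure == false the
 * loop keeps counting frames after a failure (the old behaviour as
 * described); with true it leaves the loop at the first failure.
 */
static int demo_poll(const bool *alloc_ok, int budget, bool stop_on_failure)
{
	int rx_frm_cnt = 0;
	int i;

	for (i = 0; i < budget; i++) {
		enum demo_action act = DEMO_PASS;
		bool out_of_memory = false;

		switch (act) {
		case DEMO_PASS:
			if (!alloc_ok[i])
				out_of_memory = true;
			break;	/* leaves the switch only */
		case DEMO_DROP:
			break;
		}

		if (stop_on_failure && out_of_memory)
			break;	/* leaves the for loop */

		rx_frm_cnt++;
	}

	return rx_frm_cnt;
}
```

A flag checked after the switch is one way to propagate the failure to the loop; a goto to a label after the loop would be another.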
Vladimir Oltean [Fri, 16 Apr 2021 21:22:17 +0000 (00:22 +0300)]
net: enetc: rename the buffer reuse helpers
enetc_put_xdp_buff has nothing to do with XDP, frankly, it is just a
helper to populate the recycle end of the shadow RX BD ring
(next_to_alloc) with a given buffer.
On the other hand, enetc_put_rx_buff plays more tricks than its name
would suggest.
So let's rename enetc_put_rx_buff into enetc_flip_rx_buff to reflect the
half-page buffer reuse tricks that it employs, and enetc_put_xdp_buff
into enetc_put_rx_buff which suggests a more garden-variety operation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Apr 2021 00:06:14 +0000 (17:06 -0700)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
1GbE Intel Wired LAN Driver Updates 2021-04-16
This series contains updates to igb and igc drivers.
Ederson adjusts Tx buffer distributions in Qav mode to improve
TSN-aware traffic for igb. He also enables PPS support and auxiliary PHC
functions for igc.
Grzegorz checks that the MTA register was properly written and
retries if not for igb.
Sasha adds reporting of EEE low power idle counters to ethtool and fixes
a return value being overwritten through looping for igc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
flow_dissector: Fix out-of-bounds warning in __skb_flow_bpf_to_target()
Fix the following out-of-bounds warning:
net/core/flow_dissector.c:835:3: warning: 'memcpy' offset [33, 48] from the object at 'flow_keys' is out of the bounds of referenced subobject 'ipv6_src' with type '__u32[4]' {aka 'unsigned int[4]'} at offset 16 [-Warray-bounds]
The problem is that the original code is trying to copy data into a
couple of struct members adjacent to each other in a single call to
memcpy(). So, the compiler legitimately complains about it. As these
are just a couple of members, fix this by copying each one of them in
separate calls to memcpy().
This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/109 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
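The pattern described above can be sketched in user-space C as follows. The struct layouts are hypothetical simplifications, not the kernel's definitions; only the member name ipv6_src and its four-word size are taken from the warning text. Each memcpy() is bounded by the size of the single member it writes, which is what lets the compiler verify the bounds.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts for illustration only. */
struct demo_flow_keys {
	uint32_t ipv6_src[4];
	uint32_t ipv6_dst[4];
};

struct demo_addrs {
	uint32_t src[4];
	uint32_t dst[4];
};

/*
 * Copy both addresses with one memcpy() per member instead of a single
 * call that starts at ipv6_src and runs on into ipv6_dst.
 */
static void demo_copy_addrs(struct demo_addrs *to,
			    const struct demo_flow_keys *from)
{
	memcpy(to->src, from->ipv6_src, sizeof(to->src));
	memcpy(to->dst, from->ipv6_dst, sizeof(to->dst));
}
```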