The chip cut, also known as the chip version, is a letter from A (0)
to P (15). Recognise them all instead of printing "unknown" when it's
greater than E.
The CCK RSSI calculation is incorrect for the RTL8723BU, RTL8192EU,
and RTL8188FU. Add new functions for these chips with code copied from
their vendor drivers. Use the old code only for the RTL8723AU and
RTL8192CU.
I didn't notice any difference in the reported signal strength with my
RTL8188FU, but I didn't look very hard either.
wifi: rtl8xxxu: Add central frequency offset tracking
According to Realtek programmers, "to adjust oscillator to align
central frequency of connected AP. Then, it can yield better
performance." From commit fb8517f4fade ("rtw88: 8822c: add CFO
tracking").
The RTL8192CU and a version of RTL8723AU apparently don't have the
ability to adjust the oscillator, so this doesn't apply to them.
This also doesn't apply to the wifi + bluetooth combo chips (RTL8723AU
and RTL8723BU) because the CFO tracking should only be done when
bluetooth is disabled, and determining that looked complicated.
That leaves only the RTL8192EU and RTL8188FU chips. I tested this with
the latter.
Jisoo Jang [Tue, 1 Nov 2022 18:36:42 +0000 (03:36 +0900)]
wifi: brcmfmac: Fix potential NULL pointer dereference in 'brcmf_c_preinit_dcmds()'
This patch fixes a NULL pointer dereference bug in brcmfmac that occurs
when ptr which is NULL pointer passed as an argument of strlcpy() in
brcmf_c_preinit_dcmds(). This happens when the driver passes a firmware
version string that does not contain a space " ", making strrchr()
return a null pointer. This patch adds a null pointer check.
Reported-by: Dokyung Song <dokyungs@yonsei.ac.kr> Reported-by: Jisoo Jang <jisoo.jang@yonsei.ac.kr> Reported-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Signed-off-by: Jisoo Jang <jisoo.jang@yonsei.ac.kr> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221101183642.166450-1-jisoo.jang@yonsei.ac.kr
Variable stop_report_cnt is being set or incremented but is never
being used for anything meaningful. The variable and code relating
to it's use is redundant and can be removed.
Linus Walleij [Fri, 28 Oct 2022 09:30:00 +0000 (11:30 +0200)]
bcma: Fail probe if GPIO subdriver fails
We currently register the BCMA core even if the GPIO portions
fail. There is no reason for this: the GPIO should register
just fine, if it fails the BCMA driver should fail.
We already gracefully handle the case where the GPIO driver is
not compiled in.
Linus Walleij [Fri, 28 Oct 2022 09:23:32 +0000 (11:23 +0200)]
bcma: Use the proper gpio include
The <linux/bcma/bcma_driver_chipcommon.h> is including the legacy
header <linux/gpio.h> to obtain struct gpio_chip. Instead, include
<linux/gpio/driver.h> where this struct is defined.
It turns out that the brcm80211 brcmsmac depends on this to
bring in the symbol gpio_is_valid().
The driver looks up the BCMA parent GPIO driver and checks that
this succeeds, but then it goes on to use the deprecated GPIO
call gpio_is_valid() to check the consistency of the .base
member of the BCMA GPIO struct. The whole check can be dropped
because the bcma_gpio is initialized in the declarations:
Minsuk Kang [Mon, 24 Oct 2022 07:13:29 +0000 (16:13 +0900)]
wifi: brcmfmac: Fix potential shift-out-of-bounds in brcmf_fw_alloc_request()
This patch fixes a shift-out-of-bounds in brcmfmac that occurs in
BIT(chiprev) when a 'chiprev' provided by the device is too large.
It should also not be equal to or greater than BITS_PER_TYPE(u32)
as we do bitwise AND with a u32 variable and BIT(chiprev). The patch
adds a check that makes the function return NULL if that is the case.
Note that the NULL case is later handled by the bus-specific caller,
brcmf_usb_probe_cb() or brcmf_usb_reset_resume(), for example.
Reported-by: Dokyung Song <dokyungs@yonsei.ac.kr> Reported-by: Jisoo Jang <jisoo.jang@yonsei.ac.kr> Reported-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221024071329.504277-1-linuxlovemin@yonsei.ac.kr
Kees Cook [Tue, 18 Oct 2022 02:37:59 +0000 (19:37 -0700)]
wifi: atmel: Fix atmel_private_handler array size
Fix the atmel_private_handler to correctly sized (1 element) again. (I
should have checked the data segment for differences.) This had no
behavioral impact (no private callbacks), but it made a very large
zero-filled array.
Cc: Simon Kelley <simon@thekelleys.org.uk> Cc: Kalle Valo <kvalo@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Fixes: 8af9d4068e86 ("wifi: atmel: Avoid clashing function prototypes") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221018023732.never.700-kees@kernel.org
Chin-Yen Lee [Thu, 27 Oct 2022 05:27:07 +0000 (13:27 +0800)]
wifi: rtw89: add WoWLAN pattern match support
Pattern match is an option of WoWLAN to allow the device to be woken up
from suspend mode when receiving packets matched user-designed patterns.
The patterns are written into hardware via WoWLAN firmware in suspend
flow if users have set up them. If packets matched designed pattern are
received, WoWLAN firmware will send an interrupt and then wake up the
device.
Chin-Yen Lee [Thu, 27 Oct 2022 05:27:06 +0000 (13:27 +0800)]
wifi: rtw89: add WoWLAN function support
WoWLAN is a feature which allows devices to be woken up from suspend
state through WLAN events.
When user enables WoWLAN feature and then let the device enter suspend
state, WoWLAN firmware will be loaded by the driver and periodically
monitors WiFi packets. Power consumption of WiFi chip will be reduced
in this state.
We now implement WoWLAN function in rtw8852ae and rtw8852ce chip.
Currently supported WLAN events include receiving magic packet,
rekey packet and deauth packet, and disconnecting from AP.
Chin-Yen Lee [Thu, 27 Oct 2022 05:27:05 +0000 (13:27 +0800)]
wifi: rtw89: add related H2C for WoWLAN mode
In this patch we define some H2C, which will be called during suspend
flow, to enable WoWLAN function provided by WoWLAN firmware.
These H2C includes keep alive used to send null packet to AP periodically
to avoid being disconnected by AP, disconnect detection used to configure
how we check if AP is offline, wake up control used to decide which WiFi
events could trigger resume flow, and global control used to enable WoWLAN
function.
Chih-Kang Chang [Thu, 27 Oct 2022 05:27:04 +0000 (13:27 +0800)]
wifi: rtw89: add drop tx packet function
When entering WoWLAN mode, we need to drop all transmit packets,
including those in mac buffer, to avoid memory leakage, so implement
the drop_tx function.
Signed-off-by: Chih-Kang Chang <gary.chang@realtek.com> Signed-off-by: Chin-Yen Lee <timlee@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221027052707.14605-5-pkshih@realtek.com
Chih-Kang Chang [Thu, 27 Oct 2022 05:27:03 +0000 (13:27 +0800)]
wifi: rtw89: add function to adjust and restore PLE quota
PLE RX quota, which is the setting of RX buffer, is needed to be adjusted
dynamically for WoWLAN mode, and restored when back to normal mode.
The action is not needed for rtw8852c chip.
Signed-off-by: Chih-Kang Chang <gary.chang@realtek.com> Signed-off-by: Chin-Yen Lee <timlee@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221027052707.14605-4-pkshih@realtek.com
Chih-Kang Chang [Thu, 27 Oct 2022 05:27:02 +0000 (13:27 +0800)]
wifi: rtw89: move enable_cpu/disable_cpu into fw_download
For WoWLAN mode, we need to download WoWLAN firmware by calling
fw_download(). Another, to disable/enable WiFi CPU is needed before
calling fw_download. Since Firmware runs on WiFi CPU, it is intuitive
to combine enable_cpu/disable_cpu functions into fw_download.
Zong-Zhe Yang [Thu, 20 Oct 2022 05:27:01 +0000 (13:27 +0800)]
wifi: rtw89: declare support bands with const
They are just default declarations and we won't modify them directly.
Instead, we actually do moification on their memdup now. So, they
should be declared with const.
Ping-Ke Shih [Thu, 20 Oct 2022 05:25:49 +0000 (13:25 +0800)]
wifi: rtw89: fw: adapt to new firmware format of dynamic header
Since firmware size is limited, we create variant firmwares for variant
application areas. To help driver to know firmware's capabilities, firmware
dynamic header is introduced to have more information, such as firmware
features and firmware compile flags.
Since this driver rtw89 only uses single one specific firmware at runtime,
this patch is just to ignore this dynamic header, not actually use the
content.
This patch can be backward compatible, and no this kind of firmware is
added to linux-firmware yet, so I can prepare this in advance.
Lukasz Czapnik [Thu, 27 Oct 2022 10:42:39 +0000 (03:42 -0700)]
ice: Add additional CSR registers to ETHTOOL_GREGS
In the event of a Tx hang it can be useful to read a variety of hardware
registers to capture some state about why the transmit queue got stuck.
Extend the ETHTOOL_GREGS dump provided by the ice driver with several CSR
registers that provide such relevant information regarding the hardware Tx
state. This enables capturing relevant data to enable debugging such a Tx
hang.
Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20221027104239.1691549-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 29 Oct 2022 04:56:22 +0000 (21:56 -0700)]
Merge branch 'clean-up-sfp-register-definitions'
Russell King says:
====================
Clean up SFP register definitions
This two-part patch series cleans up the SFP register definitions by
1. converting them from hex to decimal, as all the definitions in the
documents use decimal, this makes it easier to cross-reference.
2. moving the bit definitions for each register along side their
register address definition
====================
net: sfp: move field definitions along side register index
Just as we do for the A2h enum, arrange the A0h enum to have the
field definitions next to their corresponding register index.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: sfp: convert register indexes from hex to decimal
The register indexes in the standards are in decimal rather than hex,
so lets specify them in decimal in the header file so we can easily
cross-reference without converting between hex and decimal.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As a result of invesigations from Frank Wunderlich, we know a lot more
about the Mediatek "SGMII" PCS block, and can implement the PCS support
correctly. This series achieves that, and Frank has tested the final
result and reports that it works for him. The series could do with
further testing by others, but I suspect that is unlikely to happen
until it is merged based on past performances with this driver.
Briefly, the patches in order:
1. Add a new helper to get the link timer duration in nanoseconds
2. Add definitions for the newly discovered registers and updates to
bit definitions, including bitmasks for the BMCR, BMSR and two
advertisement registers.
3. Remove unnecessary/unused error handling (functions always returning
zero.)
4. Adding the missing pcs_get_state() implementation.
5. Converting the code to use regmap_update_bits() rather than
open-coding read-modify-write sequences.
6. Adding out-of-band speed and duplex forcing for all non-inband modes
not just the 802.3z link modes the code currently does.
7. Moving the release of the PHY power down to the main pcs_config()
function.
8. Moving the interface speed selection to the main pcs_config()
function.
9. Adding advertisement programming.
10. Adding correct link timer programming using the new helper in the
first patch.
11. Adding support for 802.3z negotiation.
There is one remaining issue - when configuring the PCS for in-band,
for some reason the AN restart bit is always set. This should not be
necessary, but requires further investigation with the hardware to
find out whether it is really necessary. I suspect this was a work
around for a previous poor implementation.
====================
net: mtk_eth_soc: add support for in-band 802.3z negotiation
As a result of help from Frank Wunderlich to investigate and test, we
now know how to program this PCS for in-band 802.3z negotiation. Add
support for this by moving the contents of the two functions into the
common mtk_pcs_config() function and adding the register settings for
802.3z negotiation.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: mtk_eth_soc: move and correct link timer programming
Program the link timer appropriately for the interface mode being
used, using the newly introduced phylink helper that provides the
nanosecond link timer interval.
The intervals are 1.6ms for SGMII based protocols and 10ms for
802.3z based protocols.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: mtk_eth_soc: add out of band forcing of speed and duplex in pcs_link_up
Add support for forcing the link speed and duplex setting in the
pcs_link_up() method for out of band modes, which will be useful when
we finish converting the pcs_config() method. Until then, we still have
to force duplex for 802.3z modes to work correctly.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: mtk_eth_soc: convert mtk_sgmii to use regmap_update_bits()
mtk_sgmii does a lot of read-modify-write operations, for which there
is a specific regmap function. Use this function instead of open-coding
the operations.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a pcs_get_state() implementation which uses the advertisements
to compute the resulting link modes, and BMSR contents to determine
negotiation and link status.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The functions called by the pcs_config() method always return zero, so
there is no point trying to handle an error from these functions. Make
these functions void, eliminate the "err" variable and simply return
zero from the pcs_config() function itself.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As a result of help from Frank Wunderlich to investigate and test, we
know a bit more about the PCS on the Mediatek platforms. Update the
definitions from this investigation.
This PCS appears similar, but not identical to the Lynx PCS.
Although not included in this patch, but for future reference, the PHY
ID registers at offset 4 read as 0x4d544950 'MTIP'.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a helper to convert the PHY interface mode to the required link
timer setting as stated by the appropriate standard. Inappropriate
interface modes return an error.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 26 Oct 2022 13:22:15 +0000 (15:22 +0200)]
net: Remove the obsolte u64_stats_fetch_*_irq() users (net).
Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.
Convert to the regular interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 26 Oct 2022 13:22:14 +0000 (15:22 +0200)]
net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).
Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.
Convert to the regular interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7.
All mac80211 driver are now converted to use internal TX queues, this
might cause some regressions so we wanted to do this early in the
cycle.
Note: wireless tree was merged[1] to wireless-next to avoid some
conflicts with mac80211 patches between the trees. Unfortunately there
are still two smaller conflicts in net/mac80211/util.c which Stephen
also reported[2]. In the first conflict initialise scratch_len to
"params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and
in the second conflict take the version which uses elems->scratch_pos.
mac80211
- preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues
- add API to show the link STAs in debugfs
- all mac80211 drivers are now using mac80211 internal TX queues (iTXQs)
rtw89
- support 8852BE
rtl8xxxu
- support RTL8188FU
brmfmac
- support two station interfaces concurrently
Nagarajan Maran [Fri, 14 Oct 2022 15:50:54 +0000 (21:20 +0530)]
wifi: ath11k: fix monitor vdev creation with firmware recovery
During firmware recovery, the monitor interface is not
getting created in the driver and firmware since
the respective flags are not updated properly.
So after firmware recovery is successful, when monitor
interface is brought down manually, firmware assertion
is observed, since we are trying to bring down the
interface which is not yet created in the firmware.
Fix this by updating the monitor flags properly per
phy#, during firmware recovery.
Dmitry Torokhov [Thu, 27 Oct 2022 07:34:02 +0000 (00:34 -0700)]
nfc: s3fwrn5: use devm_clk_get_optional_enabled() helper
Because we enable the clock immediately after acquiring it in probe,
we can combine the 2 operations and use devm_clk_get_optional_enabled()
helper.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch series adds support for WangXun 10 gigabit NIC, to initialize
hardware, set mac address, and register netdev.
Change log:
v6: address comments:
Jakub Kicinski: check with scripts/kernel-doc
v5: address comments:
Jakub Kicinski: clean build with W=1 C=1
v4: address comments:
Andrew Lunn: https://lore.kernel.org/all/YzXROBtztWopeeaA@lunn.ch/
v3: address comments:
Andrew Lunn: remove hw function ops, reorder functions, use BIT(n)
for register bit offset, move the same code of txgbe
and ngbe to libwx
v2: address comments:
Andrew Lunn: https://lore.kernel.org/netdev/YvRhld5rD%2FxgITEg@lunn.ch/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 28 Oct 2022 09:47:42 +0000 (10:47 +0100)]
Merge branch 'tcp-plb'
Mubashir Adnan Qureshi says:
====================
net: Add PLB functionality to TCP
This patch series adds PLB (Protective Load Balancing) to TCP and hooks
it up to DCTCP. PLB is disabled by default and can be enabled using
relevant sysctls and support from underlying CC.
PLB (Protective Load Balancing) is a host based mechanism for load
balancing across switch links. It leverages congestion signals(e.g. ECN)
from transport layer to randomly change the path of the connection
experiencing congestion. PLB changes the path of the connection by
changing the outgoing IPv6 flow label for IPv6 connections (implemented
in Linux by calling sk_rethink_txhash()). Because of this implementation
mechanism, PLB can currently only work for IPv6 traffic. For more
information, see the SIGCOMM 2022 paper:
https://doi.org/10.1145/3544216.3544226
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
rcv_wnd can be useful to diagnose TCP performance where receiver window
becomes the bottleneck. rehash reports the PLB and timeout triggered
rehash attempts by the TCP connection.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: add u32 counter in tcp_sock and an SNMP counter for PLB
A u32 counter is added to tcp_sock for counting the number of PLB
triggered rehashes for a TCP connection. An SNMP counter is also
added to count overall PLB triggered rehash events for a host. These
counters are hooked up to PLB implementation for DCTCP.
TCP_NLA_REHASH is added to SCM_TIMESTAMPING_OPT_STATS that reports
the rehash attempts triggered due to PLB or timeouts. This gives
a historical view of sustained congestion or timeouts experienced
by the TCP connection.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
PLB support is added to TCP DCTCP code. As DCTCP uses ECN as the
congestion signal, PLB also uses ECN to make decisions whether to change
the path or not upon sustained congestion.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Congestion control algorithms track PLB state and cause the connection
to trigger a path change when either of the 2 conditions is satisfied:
- No packets are in flight and (# consecutive congested rounds >=
sysctl_tcp_plb_idle_rehash_rounds)
- (# consecutive congested rounds >= sysctl_tcp_plb_rehash_rounds)
A round (RTT) is marked as congested when congestion signal
(ECN ce_ratio) over an RTT is greater than sysctl_tcp_plb_cong_thresh.
In the event of RTO, PLB (via tcp_write_timeout()) triggers a path
change and disables congestion-triggered path changes for random time
between (sysctl_tcp_plb_suspend_rto_sec, 2*sysctl_tcp_plb_suspend_rto_sec)
to avoid hopping onto the "connectivity blackhole". RTO-triggered
path changes can still happen during this cool-off period.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
PLB (Protective Load Balancing) is a host based mechanism for load
balancing across switch links. It leverages congestion signals(e.g. ECN)
from transport layer to randomly change the path of the connection
experiencing congestion. PLB changes the path of the connection by
changing the outgoing IPv6 flow label for IPv6 connections (implemented
in Linux by calling sk_rethink_txhash()). Because of this implementation
mechanism, PLB can currently only work for IPv6 traffic. For more
information, see the SIGCOMM 2022 paper:
https://doi.org/10.1145/3544216.3544226
This commit adds new sysctl knobs and sets their default values for
TCP PLB.
Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Netfilter updates for net-next
1) Move struct nft_payload_set definition to .c file where it is
only used.
2) Shrink transport and inner header offset fields in the nft_pktinfo
structure to 16-bits, from Florian Westphal.
3) Get rid of nft_objref Kbuild toggle, make it built-in into
nf_tables. This expression is used to instantiate conntrack helpers
in nftables. After removing the conntrack helper auto-assignment
toggle it this feature became more important so move it to the nf_tables
core module. Also from Florian.
4) Extend the existing function to calculate payload inner header offset
to deal with the GRE and IPIP transport protocols.
6) Add inner expression support for nf_tables. This new expression
provides a packet parser for tunneled packets which uses a userspace
description of the expected inner headers. The inner expression
invokes the payload expression (via direct call) to match on the
inner header protocol fields using the inner link, network and
transport header offsets.
An example of the bytecode generated from userspace to match on
IP source encapsulated in a VxLAN packet:
# nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4
netdev x y
[ meta load l4proto => reg 1 ]
[ cmp eq reg 1 0x00000011 ]
[ payload load 2b @ transport header + 2 => reg 1 ]
[ cmp eq reg 1 0x0000b512 ]
[ inner type vxlan hdrsize 8 flags f [ meta load protocol => reg 1 ] ]
[ cmp eq reg 1 0x00000008 ]
[ inner type vxlan hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ]
[ cmp eq reg 1 0x04030201 ]
7) Store inner link, network and transport header offsets in percpu
area to parse inner packet header once only. Matching on a different
tunnel type invalidates existing offsets in the percpu area and it
invokes the inner tunnel parser again.
8) Add support for inner meta matching. This support for
NFTA_META_PROTOCOL, which specifies the inner ethertype, and
NFT_META_L4PROTO, which specifies the inner transport protocol.
9) Extend nft_inner to parse GENEVE optional fields to calculate the
link layer offset.
10) Update inner expression so tunnel offset points to GRE header
to normalize tunnel header handling. This also allows to perform
different interpretations of the GRE header from userspace.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nft_inner: set tunnel offset to GRE header offset
netfilter: nft_inner: add geneve support
netfilter: nft_meta: add inner match support
netfilter: nft_inner: add percpu inner context
netfilter: nft_inner: support for inner tunnel header matching
netfilter: nft_payload: access ipip payload for inner offset
netfilter: nft_payload: access GRE payload via inner offset
netfilter: nft_objref: make it builtin
netfilter: nf_tables: reduce nft_pktinfo by 8 bytes
netfilter: nft_payload: move struct nft_payload_set definition where it belongs
====================
====================
ionic: VF attr replay and other updates
For better VF management when a FW update restart or a FW crash recover is
detected, the PF now will replay any user specified VF attributes to be
sure the FW hasn't lost them in the restart.
Newer FW offers more packet processing offloads, so we now support them in
the driver.
A small refactor of the Rx buffer fill cleans a bit of code and will help
future work on buffer caching.
====================
Shannon Nelson [Wed, 26 Oct 2022 14:37:42 +0000 (07:37 -0700)]
ionic: new ionic device identity level and VF start control
A new ionic dev_cmd is added to the interface in ionic_if.h,
with a new capabilities field in the ionic device identity to
signal its availability in the FW. The identity level code is
incremented to '2' to show support for this new capabilities
bitfield.
If the driver has indicated with the new identity level that
it has the VF_CTRL command, newer FW will wait for the start
command before starting the VFs after a FW update or crash
recovery.
This patch updates the driver to make use of the new VF start
control in fw_up path to be sure that the PF has set the user
attributes on the VF before the FW allows the VFs to restart.
Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Wed, 26 Oct 2022 14:37:41 +0000 (07:37 -0700)]
ionic: only save the user set VF attributes
Report the current FW values for the VF attributes, but don't
save the FW values locally, only save the vf attributes that
are given to us from the user. This allows us to replay user
data, and doesn't end up confusing things like "who set the
mac address".
Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Wed, 26 Oct 2022 14:37:40 +0000 (07:37 -0700)]
ionic: replay VF attributes after fw crash recovery
The VF attributes that the user has set into the FW through
the PF can be lost over a FW crash recovery. Much like we
already replay the PF mac/vlan filters, we now add a replay
in the recovery path to be sure the FW has the up-to-date
VF configurations.
Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c 2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion") abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start") 8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported")
Linus Torvalds [Thu, 27 Oct 2022 20:36:59 +0000 (13:36 -0700)]
Merge tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from 802.15.4 (Zigbee et al).
Current release - regressions:
- ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1
Current release - new code bugs:
- mptcp: fix abba deadlock on fastopen
- eth: stmmac: rk3588: allow multiple gmac controllers in one system
Previous releases - regressions:
- ip: rework the fix for dflt addr selection for connected nexthop
- net: couple more fixes for misinterpreting bits in struct page
after the signature was added
Previous releases - always broken:
- ipv6: ensure sane device mtu in tunnels
- openvswitch: switch from WARN to pr_warn on a user-triggerable path
- ethtool: eeprom: fix null-deref on genl_info in dump
- ieee802154: more return code fixes for corner cases in
dgram_sendmsg
- mac802154: fix link-quality-indicator recording
- eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack
offload
- eth: fec: limit register access on i.MX6UL
- eth: bcm4908_enet: update TX stats after actual transmission
- can: rcar_canfd: improve IRQ handling for RZ/G2L
Misc:
- genetlink: piggy back on the newly added resv_op_start to enforce
more sanity checks on new commands"
* tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
net: enetc: survive memory pressure without crashing
kcm: do not sense pfmemalloc status in kcm_sendpage()
net: do not sense pfmemalloc status in skb_append_pagefrags()
net/mlx5e: Fix macsec sci endianness at rx sa update
net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
net/mlx5e: Fix macsec rx security association (SA) update/delete
net/mlx5e: Fix macsec coverity issue at rx sa update
net/mlx5: Fix crash during sync firmware reset
net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed
net/mlx5e: TC, Reject forwarding from internal port to internal port
net/mlx5: Fix possible use-after-free in async command interface
net/mlx5: ASO, Create the ASO SQ with the correct timestamp format
net/mlx5e: Update restore chain id for slow path packets
net/mlx5e: Extend SKB room check to include PTP-SQ
net/mlx5: DR, Fix matcher disconnect error flow
net/mlx5: Wait for firmware to enable CRS before pci_restore_state
net/mlx5e: Do not increment ESN when updating IPsec ESN state
netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed
netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed
...
Linus Torvalds [Thu, 27 Oct 2022 20:16:36 +0000 (13:16 -0700)]
Merge tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve fixes from Kees Cook:
- Fix an ancient signal action copy race (Bernd Edlinger)
- Fix a memory leak in ELF loader, when under memory pressure (Li
Zetao)
* tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
fs/binfmt_elf: Fix memory leak in load_elf_binary()
exec: Copy oldsighand->action under spin-lock
Linus Torvalds [Thu, 27 Oct 2022 19:31:57 +0000 (12:31 -0700)]
Merge tag 'hardening-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- Fix older Clang vs recent overflow KUnit test additions (Nick
Desaulniers, Kees Cook)
- Fix kern-doc visibility for overflow helpers (Kees Cook)
* tag 'hardening-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
overflow: Refactor test skips for Clang-specific issues
overflow: disable failing tests for older clang versions
overflow: Fix kern-doc markup for functions
Linus Torvalds [Thu, 27 Oct 2022 19:21:57 +0000 (12:21 -0700)]
Merge tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"A bunch of patches addressing issues in the vivid driver and adding
new checks in V4L2 to validate the input parameters from some ioctls"
* tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vivid.rst: loop_video is set on the capture devnode
media: vivid: set num_in/outputs to 0 if not supported
media: vivid: drop GFP_DMA32
media: vivid: fix control handler mutex deadlock
media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check 'interlaced'
media: v4l2-dv-timings: add sanity checks for blanking values
media: vivid: dev->bitmap_cap wasn't freed in all cases
media: vivid: s_fbuf: add more sanity checks
Vladimir Oltean [Thu, 27 Oct 2022 18:29:25 +0000 (21:29 +0300)]
net: enetc: survive memory pressure without crashing
Under memory pressure, enetc_refill_rx_ring() may fail, and when called
during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
checked for.
An extreme case of memory pressure will result in exactly zero buffers
being allocated for the RX ring, and in such a case it is expected that
hardware drops all RX packets due to lack of buffers.
This does not happen, because the reset-default value of the consumer
and produces index is 0, and this makes the ENETC think that all buffers
have been initialized and that it owns them (when in reality none were).
The hardware guide explains this best:
| Configure the receive ring producer index register RBaPIR with a value
| of 0. The producer index is initially configured by software but owned
| by hardware after the ring has been enabled. Hardware increments the
| index when a frame is received which may consume one or more BDs.
| Hardware is not allowed to increment the producer index to match the
| consumer index since it is used to indicate an empty condition. The ring
| can hold at most RBLENR[LENGTH]-1 received BDs.
|
| Configure the receive ring consumer index register RBaCIR. The
| consumer index is owned by software and updated during operation of the
| of the BD ring by software, to indicate that any receive data occupied
| in the BD has been processed and it has been prepared for new data.
| - If consumer index and producer index are initialized to the same
| value, it indicates that all BDs in the ring have been prepared and
| hardware owns all of the entries.
| - If consumer index is initialized to producer index plus N, it would
| indicate N BDs have been prepared. Note that hardware cannot start if
| only a single buffer is prepared due to the restrictions described in
| (2).
| - Software may write consumer index to match producer index anytime
| while the ring is operational to indicate all received BDs prior have
| been processed and new BDs prepared for hardware.
Normally, the value of rx_ring->rcir (consumer index) is brought in sync
with the rx_ring->next_to_use software index, but this only happens if
page allocation ever succeeded.
When PI==CI==0, the hardware appears to receive frames and write them to
DMA address 0x0 (?!), then set the READY bit in the BD.
The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
not prepared to handle such a condition. It will attempt to process
those frames using the rx_swbd structure associated with index i of the
RX ring, but that structure is not fully initialized (enetc_new_page()
does all of that). So what happens next is undefined behavior.
To operate using no buffer, we must initialize the CI to PI + 1, which
will block the hardware from advancing the CI any further, and drop
everything.
The issue was seen while adding support for zero-copy AF_XDP sockets,
where buffer memory comes from user space, which can even decide to
supply no buffers at all (example: "xdpsock --txonly"). However, the bug
is present also with the network stack code, even though it would take a
very determined person to trigger a page allocation failure at the
perfect time (a series of ifup/ifdown under memory pressure should
eventually reproduce it given enough retries).
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 27 Oct 2022 04:03:46 +0000 (04:03 +0000)]
net: do not sense pfmemalloc status in skb_append_pagefrags()
skb_append_pagefrags() is used by af_unix and udp sendpage()
implementation so far.
In commit 326140063946 ("tcp: TX zerocopy should not sense
pfmemalloc status") we explained why we should not sense
pfmemalloc status for pages owned by user space.
We should also use skb_fill_page_desc_noacc()
in skb_append_pagefrags() to avoid following KCSAN report:
BUG: KCSAN: data-race in lru_add_fn / skb_append_pagefrags
value changed: 0x0000000000000000 -> 0xffffea00058fc188
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 17325 Comm: syz-executor.0 Not tainted 6.1.0-rc1-syzkaller-00158-g440b7895c990-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022
Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221027040346.1104204-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Raed Salem [Wed, 26 Oct 2022 13:51:53 +0000 (14:51 +0100)]
net/mlx5e: Fix macsec sci endianness at rx sa update
The cited commit at rx sa update operation passes the sci object
attribute, in the wrong endianness and not as expected by the HW
effectively create malformed hw sa context in case of update rx sa
consequently, HW produces unexpected MACsec packets which uses this
sa.
Fix by passing sci to create macsec object with the correct endianness,
while at it add __force u64 to prevent sparse check error of type
"sparse: error: incorrect type in assignment".
Raed Salem [Wed, 26 Oct 2022 13:51:52 +0000 (14:51 +0100)]
net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
The cited commit produces a sparse check error of type
"sparse: error: restricted __be64 degrades to integer". The
offending line wrongly did a bitwise operation between two different
storage types one of 64 bit when the other smaller side is 16 bit
which caused the above sparse error, furthermore bitwise operation
usage here is wrong in the first place as the constant MACSEC_PORT_ES
is not a bitwise field.
Fix by using the right mask to get the lower 16 bit if the sci number,
and use comparison operator '==' instead of bitwise '&' operator.
Raed Salem [Wed, 26 Oct 2022 13:51:51 +0000 (14:51 +0100)]
net/mlx5e: Fix macsec rx security association (SA) update/delete
The cited commit adds the support for update/delete MACsec Rx SA,
naturally, these operations need to check if the SA in question exists
to update/delete the SA and return error code otherwise, however they
do just the opposite i.e. return with error if the SA exists
Fix by change the check to return error in case the SA in question does
not exist, adjust error message and code accordingly.
Raed Salem [Wed, 26 Oct 2022 13:51:50 +0000 (14:51 +0100)]
net/mlx5e: Fix macsec coverity issue at rx sa update
The cited commit at update rx sa operation passes object attributes
to MACsec object create function without initializing/setting all
attributes fields leaving some of them with garbage values, therefore
violating the implicit assumption at create object function, which
assumes that all input object attributes fields are set.
Fix by initializing the object attributes struct to zero, thus leaving
unset fields with the legal zero value.
When setting Bluefield to DPU NIC mode using mlxconfig tool + sync
firmware reset flow, we run into scenario where the host was not
eswitch manager at the time of mlx5 driver load but becomes eswitch manager
after the sync firmware reset flow. This results in null pointer
access of mpfs structure during mac filter add. This change prevents null
pointer access but mpfs table entries will not be added.
Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Suresh Devarakonda <ramad@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Roy Novich [Wed, 26 Oct 2022 13:51:48 +0000 (14:51 +0100)]
net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
Update devlink health fw fatal reporter state to "healthy" is needed by
strictly calling devlink_health_reporter_state_update() after recovery
was done by PCI error handler. This is needed when fw_fatal reporter was
triggered due to PCI error. Poll health is called and set reporter state
to error. Health recovery failed (since EEH didn't re-enable the PCI).
PCI handlers keep on recover flow and succeed later without devlink
acknowledgment. Fix this by adding devlink state update at the end of
the PCI handler recovery process.
Fixes: 6181e5cb752e ("devlink: add support for reporter recovery completion") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Roi Dayan [Wed, 26 Oct 2022 13:51:47 +0000 (14:51 +0100)]
net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed
On multi table split the driver creates a new attr instance with
data being copied from prev attr instance zeroing action flags.
Also need to reset dests properties to avoid incorrect dests per attr.
Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ariel Levkovich [Wed, 26 Oct 2022 13:51:46 +0000 (14:51 +0100)]
net/mlx5e: TC, Reject forwarding from internal port to internal port
Reject TC rules that forward from internal port to internal port
as it is not supported.
This include rules that are explicitly have internal port as
the filter device as well as rules that apply on tunnel interfaces
as the route device for the tunnel interface can be an internal
port.
Fixes: 27484f7170ed ("net/mlx5e: Offload tc rules that redirect to ovs internal port") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tariq Toukan [Wed, 26 Oct 2022 13:51:45 +0000 (14:51 +0100)]
net/mlx5: Fix possible use-after-free in async command interface
mlx5_cmd_cleanup_async_ctx should return only after all its callback
handlers were completed. Before this patch, the below race between
mlx5_cmd_cleanup_async_ctx and mlx5_cmd_exec_cb_handler was possible and
lead to a use-after-free:
1. mlx5_cmd_cleanup_async_ctx is called while num_inflight is 2 (i.e.
elevated by 1, a single inflight callback).
2. mlx5_cmd_cleanup_async_ctx decreases num_inflight to 1.
3. mlx5_cmd_exec_cb_handler is called, decreases num_inflight to 0 and
is about to call wake_up().
4. mlx5_cmd_cleanup_async_ctx calls wait_event, which returns
immediately as the condition (num_inflight == 0) holds.
5. mlx5_cmd_cleanup_async_ctx returns.
6. The caller of mlx5_cmd_cleanup_async_ctx frees the mlx5_async_ctx
object.
7. mlx5_cmd_exec_cb_handler goes on and calls wake_up() on the freed
object.
Fix it by syncing using a completion object. Mark it completed when
num_inflight reaches 0.
Trace:
BUG: KASAN: use-after-free in do_raw_spin_lock+0x23d/0x270
Read of size 4 at addr ffff888139cd12f4 by task swapper/5/0
Saeed Mahameed [Wed, 26 Oct 2022 13:51:44 +0000 (14:51 +0100)]
net/mlx5: ASO, Create the ASO SQ with the correct timestamp format
mlx5 SQs must select the timestamp format explicitly according to the
active clock mode, select the current active timestamp mode so ASO SQ create
will succeed.
This fixes the following error prints when trying to create ipsec ASO SQ
while the timestamp format is real time mode.
mlx5_cmd_out_err:778:(pid 34874): CREATE_SQ(0x904) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xd61c0b), err(-22)
mlx5_aso_create_sq:285:(pid 34874): Failed to open aso wq sq, err=-22
mlx5e_ipsec_init:436:(pid 34874): IPSec initialization failed, -22
Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-7-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Blakey [Wed, 26 Oct 2022 13:51:43 +0000 (14:51 +0100)]
net/mlx5e: Update restore chain id for slow path packets
Currently encap slow path rules just forward to software without
setting the chain id miss register, so driver doesn't restore
the chain, and packets hitting this rule will restart from tc chain
0 instead of continuing to the chain the encap rule was on.
Fix this by setting the chain id miss register to the chain id mapping.
Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-6-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Aya Levin [Wed, 26 Oct 2022 13:51:42 +0000 (14:51 +0100)]
net/mlx5e: Extend SKB room check to include PTP-SQ
When tx_port_ts is set, the driver diverts all UPD traffic over PTP port
to a dedicated PTP-SQ. The SKBs are cached until the wire-CQE arrives.
When the packet size is greater then MTU, the firmware might drop it and
the packet won't be transmitted to the wire, hence the wire-CQE won't
reach the driver. In this case the SKBs are accumulated in the SKB fifo.
Add room check to consider the PTP-SQ SKB fifo, when the SKB fifo is
full, driver stops the queue resulting in a TX timeout. Devlink
TX-reporter can recover from it.
Fixes: 1880bc4e4a96 ("net/mlx5e: Add TX port timestamp support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rongwei Liu [Wed, 26 Oct 2022 13:51:41 +0000 (14:51 +0100)]
net/mlx5: DR, Fix matcher disconnect error flow
When 2nd flow rules arrives, it will merge together with the
1st one if matcher criteria is the same.
If merge fails, driver will rollback the merge contents, and
reject the 2nd rule. At rollback stage, matcher can't be
disconnected unconditionally, otherise the 1st rule can't be
hit anymore.
Add logic to check if the matcher should be disconnected or not.
Fixes: cc2295cd54e4 ("net/mlx5: DR, Improve steering for empty or RX/TX-only matchers") Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-4-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Moshe Shemesh [Wed, 26 Oct 2022 13:51:40 +0000 (14:51 +0100)]
net/mlx5: Wait for firmware to enable CRS before pci_restore_state
After firmware reset driver should verify firmware already enabled CRS
and became responsive to pci config cycles before restoring pci state.
Fix that by waiting till device_id is readable through PCI again.
Hyong Youb Kim [Wed, 26 Oct 2022 13:51:39 +0000 (14:51 +0100)]
net/mlx5e: Do not increment ESN when updating IPsec ESN state
An offloaded SA stops receiving after about 2^32 + replay_window
packets. For example, when SA reaches <seq-hi 0x1, seq 0x2c>, all
subsequent packets get dropped with SA-icv-failure (integrity_failed).
To reproduce the bug:
- ConnectX-6 Dx with crypto enabled (FW 22.30.1004)
- ipsec.conf:
nic-offload = yes
replay-window = 32
esn = yes
salifetime=24h
- Run netperf for a long time to send more than 2^32 packets
netperf -H <device-under-test> -t TCP_STREAM -l 20000
When 2^32 + replay_window packets are received, the replay window
moves from the 2nd half of subspace (overlap=1) to the 1st half
(overlap=0). The driver then updates the 'esn' value in NIC
(i.e. seq_hi) as follows.
seq_hi = xfrm_replay_seqhi(seq_bottom)
new esn in NIC = seq_hi + 1
The +1 increment is wrong, as seq_hi already contains the correct
seq_hi. For example, when seq_hi=1, the driver actually tells NIC to
use seq_hi=2 (esn). This incorrect esn value causes all subsequent
packets to fail integrity checks (SA-icv-failure). So, do not
increment.
Fixes: cb01008390bb ("net/mlx5: IPSec, Add support for ESN") Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhengchao Shao [Wed, 26 Oct 2022 01:46:42 +0000 (09:46 +0800)]
netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed
Remove dir in nsim_dev_debugfs_init() when creating ports dir failed.
Otherwise, the netdevsim device will not be created next time. Kernel
reports an error: debugfs: Directory 'netdevsim1' with parent 'netdevsim'
already present!
Fixes: ab1d0cc004d7 ("netdevsim: change debugfs tree topology") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhengchao Shao [Wed, 26 Oct 2022 01:54:05 +0000 (09:54 +0800)]
netdevsim: fix memory leak in nsim_bus_dev_new()
If device_register() failed in nsim_bus_dev_new(), the value of reference
in nsim_bus_dev->dev is 1. obj->name in nsim_bus_dev->dev will not be
released.
Jakub Kicinski [Thu, 27 Oct 2022 17:30:41 +0000 (10:30 -0700)]
Merge tag 'linux-can-fixes-for-6.1-20221027' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-10-27
Anssi Hannula fixes the use of the completions in the kvaser_usb
driver.
Biju Das contributes 2 patches for the rcar_canfd driver. A IRQ storm
that can be triggered by high CAN bus load and channel specific IRQ
handlers are fixed.
Yang Yingliang fixes the j1939 transport protocol by moving a
kfree_skb() out of a spin_lock_irqsave protected section.
* tag 'linux-can-fixes-for-6.1-20221027' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before kfree_skb()
can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L
can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive
can: kvaser_usb: Fix possible completions during init_completion
====================
Rafał Miłecki [Thu, 27 Oct 2022 11:24:30 +0000 (13:24 +0200)]
net: broadcom: bcm4908_enet: update TX stats after actual transmission
Queueing packets doesn't guarantee their transmission. Update TX stats
after hardware confirms consuming submitted data.
This also fixes a possible race and NULL dereference.
bcm4908_enet_start_xmit() could try to access skb after freeing it in
the bcm4908_enet_poll_tx().