Paolo Abeni [Tue, 30 Mar 2021 10:28:50 +0000 (12:28 +0200)]
udp: skip L4 aggregation for UDP tunnel packets
If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
are UDP tunnels available in the system, udp_gro_receive() could end-up
doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
the outer UDP tunnel level for packets effectively carrying a UDP
tunnel header.
That could cause inner protocol corruption. If e.g. the relevant
packets carry a vxlan header, different vxlan ids will be ignored/
aggregated to the same GSO packet. Inner headers will be ignored, too,
so that e.g. TCP over vxlan push packets will be held in the GRO
engine till the next flush, etc.
Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
current packet could land in a UDP tunnel, and let udp_gro_receive()
do GRO via udp_sk(sk)->gro_receive.
The check implemented in this patch is broader than what is strictly
needed, as the existing UDP tunnel could be e.g. configured on top of
a different device: we could end up skipping GRO altogether for some packets.
Anyhow, that is a very thin corner case and covering it will add quite
a bit of complexity.
v1 -> v2:
- hopefully clarify the commit message
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ip_summed will be set to CHECKSUM_PARTIAL at creation time and
such checksum mode will be preserved in the above path up to the
UDP tunnel receive code where we have:
The UDP GSO packet will be later segmented as part of the rx socket
receive operation, and will present a CHECKSUM_NONE after segmentation.
Additionally, the segmented packets' UDP CB still refers to the original
GSO packet length. Overall that causes unexpected/wrong csum validation
errors later in the UDP receive path.
We could possibly address the issue with some additional checks and
csum mangling in the UDP tunnel code. Since the issue affects only
this UDP receive slow path, let's set a suitable csum status there.
Note that SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets lacking a UDP
encapsulation present a valid checksum when landing in udp_queue_rcv_skb(),
as the UDP checksum has been validated by the GRO engine.
v2 -> v3:
- even more verbose commit message and comments
v1 -> v2:
- restrict the csum update to the packets strictly needing them
- hopefully clarify the commit message and code comments
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Calling "zlib_inflateEnd(&state->strm)" is only useful for its return
value, which is ignored.
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 30 Mar 2021 07:27:56 +0000 (15:27 +0800)]
net: ipa: remove repeated words
Remove repeated words "that" and "the".
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Acked-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 30 Mar 2021 07:27:55 +0000 (15:27 +0800)]
net: phy: remove repeated word
Remove repeated word "to".
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 30 Mar 2021 07:27:54 +0000 (15:27 +0800)]
net: bonding: remove repeated word
Remove repeated word "that".
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Tue, 30 Mar 2021 07:27:53 +0000 (15:27 +0800)]
net: i40e: remove repeated words
Remove repeated words "to" and "try".
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Mar 2021 23:54:50 +0000 (16:54 -0700)]
Merge branch 'obsdolete-todo'
Wang Qing says:
====================
Clean up obsolete TODO files
The official documents of the Linux Foundation and the wiki mention that
you can participate in development by following the TODO files of each
module.
But the TODO files here have not been updated for 15 years, and the
features described in them have either been implemented or abandoned.
Keeping them will mislead developers into relying on outdated information.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Qing [Tue, 30 Mar 2021 07:02:45 +0000 (15:02 +0800)]
scsi/aacraid: Delete obsolete TODO file
The TODO file here has not been updated since 2.6.12, more than 15 years ago.
Keeping it will mislead developers into relying on outdated information.
Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 30 Mar 2021 12:55:39 +0000 (20:55 +0800)]
net: mhi: remove pointless conditional before kfree_skb()
kfree_skb() already checks for a NULL pointer, so remove the pointless
pointer check before calling kfree_skb().
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The cross time-stamping mechanism used in certain instances of Intel mGbE
may run at a different clock frequency than the one used by the
processor, so we introduce a cross T/S frequency adjustment to ensure
the TSC calculation is correct when the processor gets the cross
time-stamps.
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Mar 2021 20:29:39 +0000 (13:29 -0700)]
Merge branch 'rfc8335-probe'
Andreas Roeseler says:
====================
add support for RFC 8335 PROBE
The popular utility ping has several severe limitations, such as the
inability to query specific interfaces on a node and requiring
bidirectional connectivity between the probing and probed interfaces.
RFC 8335 attempts to solve these limitations by creating the new utility
PROBE which is a specialized ICMP message that makes use of the ICMP
Extension Structure outlined in RFC 4884.
This patchset adds definitions for the ICMP Extended Echo Request and
Reply (PROBE) types for both IPV4 and IPV6, adds a sysctl to enable
responses to PROBE messages, expands the list of supported ICMP messages
to accommodate PROBE types, adds ipv6_dev_find into ipv6_stubs, and adds
functionality to respond to PROBE requests.
Changes:
v1 -> v2:
- Add AFI definitions
- Switch to functions such as dev_get_by_name and ip_dev_find to look up
net devices
v2 -> v3:
Suggested by Willem de Bruijn <willemdebruijn.kernel@gmail.com>
- Add verification of incoming messages before looking up netdev
- Add prefix for PROBE specific defined variables
- Use proc_dointvec_minmax with zero and one for sysctl
- Create struct icmp_ext_echo_iio for parsing incoming packets
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
- Include net/addrconf.h library for ipv6_dev_find
v3 -> v4:
- Use in_addr instead of __be32 for storing IPV4 addresses
- Use IFNAMSIZ to statically allocate space for name in
icmp_ext_echo_iio
Suggested by Willem de Bruijn <willemdebruijn.kernel@gmail.com>
- Use skb_header_pointer to verify fields in incoming message
- Add check to ensure that extobj_hdr.length is valid
- Check to ensure object payload is padded with ASCII NULL characters
when probing by name, as specified by RFC 8335
- Statically allocate buff using IFNAMSIZ
- Add rcu blocking around ipv6_dev_find
- Use __in_dev_get_rcu to access IPV4 addresses of identified
net_device
- Remove check for ICMPV6 PROBE types
v4 -> v5:
- Statically allocate buff to size IFNAMSIZ on declaration
- Remove goto probe in favor of single branch
- Remove strict check for incoming PROBE request padding to nearest
32-bit boundary
Reported-by: kernel test robot <lkp@intel.com>
v5 -> v6:
- Add documentation for icmp_echo_enable_probe sysctl
- Remove RCU locking around ipv6_dev_find()
- Assign iio based on ctype
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Roeseler [Tue, 30 Mar 2021 01:45:51 +0000 (18:45 -0700)]
icmp: add response to RFC 8335 PROBE messages
Modify the icmp_rcv function to check PROBE messages and call icmp_echo
if a PROBE request is detected.
Modify the existing icmp_echo function to respond to both ping and PROBE
requests.
This was tested using a custom modification to the iputils package and
wireshark. It supports IPV4 probing by name, ifindex, and probing by
both IPV4 and IPV6 addresses. It currently does not support responding
to probes off the proxy node (see RFC 8335 Section 2).
The modification to the iputils package is still in development and can
be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It
supports full sending functionality of PROBE requests, but currently
does not parse the response messages, which is why Wireshark is required
to verify the sent and received PROBE messages. The modification adds
the ``-e'' flag to the command which allows the user to specify the
interface identifier to query the probed host. An example usage would be
<./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the
destination node.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Roeseler [Tue, 30 Mar 2021 01:45:36 +0000 (18:45 -0700)]
net: add support for sending RFC 8335 PROBE messages
Modify the ping_supported function to support PROBE message types. This
allows tools such as the ping command in the iputils package to be
modified to send PROBE requests through the existing framework for
sending ping requests.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Roeseler [Tue, 30 Mar 2021 01:45:29 +0000 (18:45 -0700)]
net: add sysctl for enabling RFC 8335 PROBE messages
Section 8 of RFC 8335 specifies potential security concerns of
responding to PROBE requests, and states that nodes that support PROBE
functionality MUST be able to enable/disable responses and that
responses MUST be disabled by default.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Edich [Mon, 29 Mar 2021 09:45:36 +0000 (11:45 +0200)]
net: phy: lan87xx: fix access to wrong register of LAN87xx
The function lan87xx_config_aneg_ext was introduced to configure
LAN95xxA, but it also writes to an undocumented register of LAN87xx.
This fix prevents that access.
The function lan87xx_config_aneg_ext is also given a name more suitable
for its new behavior.
Reported-by: Måns Rullgård <mans@mansr.com> Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Andre Edich <andre.edich@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
this is a pull request of 39 patches for net-next/master.
The first two patches update the MAINTAINERS file. One is by me and
removes Dan Murphy from m_can and tcan4x5x. The other one is
by Pankaj Sharma and updates the maintainership of the m-can mmio
driver.
The next three patches are by me and update the CAN echo skb handling.
Vincent Mailhol provides 5 patches where Transmitter Delay
Compensation is added and the CAN bittiming calculation is cleaned up.
The next patch is by me and adds a missing HAS_IOMEM to the grcan
driver.
Michal Simek's patch for the xilinx driver adds dev_err_probe()
support.
Arnd Bergmann's patch for the ucan driver fixes a compiler warning.
Stephane Grosjean provides 3 patches for the peak USB drivers, which
add ethtool set_phys_id and CAN one-shot mode.
Xulin Sun's patch removes an unneeded return check in the m-can
driver. Torin Cooper-Bennun provides 3 patches for the m-can driver
that add rx-offload support to ensure that skbs are sent from softirq
context. Wan Jiabing's patch for the tcan4x5x driver removes a
duplicate include.
The next 6 patches are by me and target the mcp251xfd driver. They add
devcoredump support, simplify the UINC handling, and add HW timestamp
support.
The remaining 12 patches target the c_can driver. The first 6 are by
me and do generic checkpatch related cleanup work. Dario Binacchi's
patches bring some cleanups and increase the number of usable message
objects from 16 to 64.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Mar 2021 19:59:25 +0000 (12:59 -0700)]
Merge tag 'mlx5-updates-2021-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-03-29
Coexistence of CQE compression and HW PTP time-stamp:
From Aya: this series improves the mlx5 netdev driver to allow
both mlx5 CQE compression (RX descriptor compression, which saves on PCI
transactions) and HW time-stamp PTP to coexist.
Prior to this series both features were mutually exclusive due to the
nature of CQE compression which reduces the size of RX descriptor for
the price of trimming some data, such as the time-stamp.
In order to allow CQE compression when PTP time stamping is enabled,
we enable it on the regular performance critical RX queues which will
service all the data path traffic that is not PTP.
PTP traffic will be re-directed to dedicated RX queues on which we will
not enable CQE compression and thus keep the time-stamp intact.
Having both features is critical for systems with low PCI BW, e.g.
Multi-Host.
The series will be adding:
1) Infrastructure to create a dedicated RX queue to service the PTP traffic
2) Flow steering plumbing to capture PTP traffic both UDP packets with
destination port 319 and L2 packets with ethertype 0x88F7
3) Steer PTP traffic to the dedicated RX queue.
4) The feature will be enabled when PTP is being configured via the
already existing PTP IOCTL when CQE compression is active, otherwise
no change to the driver flow.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dario Binacchi [Tue, 2 Mar 2021 21:54:35 +0000 (22:54 +0100)]
can: c_can: add support to 64 message objects
D_CAN controller supports 16, 32, 64 or 128 message objects, compared
to 32 on C_CAN. AM335x/AM437x Sitara processors and DRA7 SOC all
instantiate a D_CAN controller with 64 message objects, as described
in the "DCAN features" subsection of the CAN chapter of their
technical reference manuals.
The driver policy has been kept unchanged, and as in the previous
version, the first half of the message objects is used for reception
and the second for transmission.
The I/O load increases only in the case of 64 message objects and is
unchanged in the case of 32: 64 message objects require two 32-bit read
accesses, whereas configurations with 32 message objects still use
16-bit accesses.
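As a rough illustration of the unchanged policy described above, the following sketch splits a message object count into an RX half and a TX half. The struct and function names are hypothetical, and numbering the objects from 1 is an assumption; this is not the driver code.

```c
#include <stdint.h>

/* Hypothetical layout descriptor; not the driver's real structure. */
struct msg_obj_layout {
	uint32_t rx_first, rx_last;	/* objects used for reception */
	uint32_t tx_first, tx_last;	/* objects used for transmission */
};

/* First half for RX, second half for TX; objects assumed numbered from 1. */
static inline struct msg_obj_layout split_msg_objs(uint32_t msg_obj_num)
{
	struct msg_obj_layout l;

	l.rx_first = 1;
	l.rx_last = msg_obj_num / 2;
	l.tx_first = l.rx_last + 1;
	l.tx_last = msg_obj_num;
	return l;
}
```

With 64 message objects this gives objects 1 to 32 for reception and 33 to 64 for transmission.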
Dario Binacchi [Tue, 2 Mar 2021 21:54:34 +0000 (22:54 +0100)]
can: c_can: prepare to up the message objects number
As pointed out by commit c0a9f4d396c9 ("can: c_can: Reduce register
access") the "driver casts the 16 message objects in stone, which is
completely braindead as contemporary hardware has up to 128 message
objects".
The patch prepares the module to extend the number of message objects
beyond the 32 currently managed. This was achieved by transforming the
constants used to manage RX/TX messages into variables without
changing the driver policy.
Dario Binacchi [Tue, 2 Mar 2021 21:54:32 +0000 (22:54 +0100)]
can: c_can: add a comment about IF_RX interface's use
After reading the commit 640916db2bf7 ("can: c_can: Make it SMP safe")
it may sound strange to see the IF_RX interface used by the
can_inval_tx_object function. A comment was added to avoid any
misunderstanding.
can: mcp251xfd: add HW timestamp to RX, TX and error CAN frames
This patch uses the previously added mcp251xfd_skb_set_timestamp()
function to convert the timestamp done by the CAN controller into a
proper skb hw timestamp.
This patch adds the HW timestamping infrastructure. The mcp251xfd has a
free running timer of 32 bit width, running at max 40MHz, which wraps
around every 107 seconds. The current timestamp is latched into RX and
TEF objects automatically by the CAN controller.
This patch sets up a cyclecounter, timecounter and delayed worker
infrastructure (which runs every 45 seconds) to convert the timer into
a proper 64 bit based ns timestamp.
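The numbers above can be checked with a small sketch. It computes the wrap period of a free running counter and extends 32 bit readings to a 64 bit tick count, provided the counter is read at least once per wrap. This is a simplified stand-in for the idea only; the helper names are hypothetical and it does not use the kernel's cyclecounter/timecounter API.

```c
#include <stdint.h>

/* Whole seconds until a free running counter of `bits` width (< 64) wraps. */
static inline uint64_t counter_wrap_seconds(unsigned int bits, uint64_t freq_hz)
{
	return (UINT64_C(1) << bits) / freq_hz;
}

/* Hypothetical state for extending a 32 bit counter to 64 bit. */
struct ts_extender {
	uint32_t last;	/* previous raw counter reading */
	uint64_t ticks;	/* accumulated 64 bit tick count */
};

/* Must be called at least once per counter wrap to stay correct. */
static inline uint64_t ts_extend(struct ts_extender *e, uint32_t now)
{
	uint32_t delta = now - e->last;	/* unsigned math absorbs one wrap */

	e->last = now;
	e->ticks += delta;
	return e->ticks;
}

/* Convert ticks to nanoseconds without overflowing for large tick counts. */
static inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	return (ticks / freq_hz) * UINT64_C(1000000000) +
	       (ticks % freq_hz) * UINT64_C(1000000000) / freq_hz;
}
```

For a 32 bit counter at 40 MHz, counter_wrap_seconds() yields 107, so a worker period of 45 seconds reads the counter more than twice per wrap.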
| 1f652bb6bae7 can: mcp25xxfd: rx-path: reduce number of SPI core requests to set UINC bit
| 68c0c1c7f966 can: mcp251xfd: tef-path: reduce number of SPI core requests to set UINC bit
the setting of the UINC bit in the TEF and RX FIFO was batched into a
single SPI message consisting of several transfers. All transfers but
the last need to have the cs_change set to 1.
In the original patches the array of prepared transfers is sent from
the beginning with the length depending on the number of read TEF/RX
objects. The cs_change of the last transfer is temporarily set to
0 during send.
This patch removes the modification of cs_change by preparing the last
transfer with cs_change to 0 and all other to 1. When sending the SPI
message the driver now starts with an offset into the array, so that
it always ends on the last entry in the array, which has the cs_change
set to 0.
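The offset trick can be sketched as follows. The transfer type is a hypothetical stand-in that only models cs_change, and the helper names are made up for illustration; the real driver works on SPI transfer structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for a SPI transfer; only cs_change is modelled. */
struct uinc_xfer {
	int cs_change;
};

/* Prepare once: every transfer but the last has cs_change set to 1. */
static inline void uinc_xfers_prepare(struct uinc_xfer *xfers, size_t n)
{
	for (size_t i = 0; i < n; i++)
		xfers[i].cs_change = (i + 1 < n);
}

/*
 * Pick the sub-array to send for `len` objects (1 <= len <= n).
 * Starting at offset n - len means the message always ends on the
 * last prepared entry, so no cs_change needs to be patched at send time.
 */
static inline const struct uinc_xfer *
uinc_xfers_tail(const struct uinc_xfer *xfers, size_t n, size_t len)
{
	return &xfers[n - len];
}
```

With 8 prepared transfers and 3 objects read, the message starts at index 5 and ends at index 7, the only entry with cs_change set to 0.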
For easier debugging this patch adds dev coredump support to the
driver. A dev coredump is generated in case the chip fails to start or
an error in the interrupt handler is detected.
The dev coredump consists of all chip registers and chip memory, as
well as the driver's internal state of the TEF-, RX- and TX-FIFOs, it
can be analyzed with the mcp251xfd-dump tool of the can-utils:
can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context
For peripheral devices, m_can sent skbs directly from a threaded irq
instead of from a softirq context, breaking the tcan4x5x peripheral
driver completely. This patch transitions the driver to use the
rx-offload helper for peripherals, ensuring the skbs are sent from the
correct context, with h/w timestamping to ensure correct ordering.
can: m_can: m_can_chip_config(): enable and configure internal timestamps
This is a prerequisite for transitioning the m_can driver to rx-offload,
which works best with TX and RX timestamps.
The timestamps provided by M_CAN are 16-bit, timed according to the
nominal bit timing, and may be prescaled by a multiplier up to 16. We
choose the highest prescaler so that the timestamp wraps every 2^20 bit
times, or 209 ms at a bus speed of 5 Mbit/s. Timestamps will have a
precision of 16 bit times.
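The wrap figures quoted above follow from simple arithmetic, shown here as a sketch. The helper names are hypothetical and are not part of the m_can driver.

```c
#include <stdint.h>

/* Bit times until a prescaled timestamp counter of `counter_bits` wraps. */
static inline uint64_t ts_wrap_bit_times(unsigned int counter_bits,
					 unsigned int prescaler)
{
	return (UINT64_C(1) << counter_bits) * prescaler;
}

/* Wrap period in microseconds (rounded down) at the given bit rate. */
static inline uint64_t ts_wrap_us(uint64_t wrap_bit_times, uint64_t bitrate_bps)
{
	return wrap_bit_times * UINT64_C(1000000) / bitrate_bps;
}
```

A 16-bit counter with a prescaler of 16 wraps after 2^20 bit times, which is about 209715 us (209 ms) at 5 Mbit/s.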
If the CAN net device has been successfully allocated, its private
data structure cannot be NULL, so remove this redundant error return
check.
This patch adds "ONE-SHOT" mode support to the following CAN-USB
PEAK-System GmbH interfaces:
- PCAN-USB X6
- PCAN-USB FD
- PCAN-USB Pro FD
- PCAN-Chip USB
- PCAN-USB Pro
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
[mkl: split into two patches] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Michal Simek [Thu, 4 Feb 2021 12:42:48 +0000 (13:42 +0100)]
can: xilinx_can: Simplify code by using dev_err_probe()
Use already prepared dev_err_probe() introduced by commit a787e5400a1c
("driver core: add device probe log helper").
It simplifies EPROBE_DEFER handling.
Also unify message format for similar error cases.
can: grcan: add missing Kconfig dependency to HAS_IOMEM
On ARCHs without IOMEM support the grcan driver fails to link due to
missing iomem functionality. This patch adds the missing Kconfig
dependency to HAS_IOMEM.
Vincent Mailhol [Sat, 6 Mar 2021 05:40:40 +0000 (14:40 +0900)]
can: bittiming: add CAN_KBPS, CAN_MBPS and CAN_MHZ macros
Add three macros to improve the readability of big bit timing numbers:
- CAN_KBPS: kilobits per second (one thousand)
- CAN_MBPS: megabits per second (one million)
- CAN_MHZ: megahertz (one million)
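A minimal sketch of how such macros read in practice. The unsigned long suffix and the exact definitions are assumptions (the text above only gives names and magnitudes), and the example constants are hypothetical.

```c
#define CAN_KBPS 1000UL		/* kilobits per second */
#define CAN_MBPS 1000000UL	/* megabits per second */
#define CAN_MHZ 1000000UL	/* megahertz */

/* Hypothetical example values, only to show how the macros read. */
static const unsigned long example_bitrate = 500 * CAN_KBPS;
static const unsigned long example_dbitrate = 4 * CAN_MBPS;
static const unsigned long example_clock = 40 * CAN_MHZ;
```

Writing 500 * CAN_KBPS instead of 500000 makes the order of magnitude visible at a glance.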
Vincent Mailhol [Wed, 24 Feb 2021 00:20:08 +0000 (09:20 +0900)]
can: bittiming: add calculation for CAN FD Transmitter Delay Compensation (TDC)
The logic for the tdco calculation is to just reuse the normal sample
point: tdco = sp. Because the sample point is expressed in tenths of a
percent and the tdco is expressed in time quanta, a conversion is
needed.
At the end,
ssp = tdcv + tdco
= tdcv + sp.
Another popular method is to set tdco to the middle of the bit:
tdc->tdco = can_bit_time(dbt) / 2
During benchmark tests, we could not find a clear advantage for one
of the two methods.
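The two methods can be compared with a short sketch. The function names are hypothetical, the truncating integer division is an assumption, and any clamping against controller limits is left out; this is not the kernel implementation.

```c
#include <stdint.h>

/*
 * Method 1: reuse the sample point. `sample_point` is in tenths of a
 * percent (750 means 75.0 %), `bit_time_tq` is the data bit time in
 * time quanta, and the result is in time quanta.
 */
static inline uint32_t tdco_from_sample_point(uint32_t bit_time_tq,
					      uint32_t sample_point)
{
	return bit_time_tq * sample_point / 1000;
}

/* Method 2: place tdco in the middle of the bit. */
static inline uint32_t tdco_mid_bit(uint32_t bit_time_tq)
{
	return bit_time_tq / 2;
}
```

For a bit time of 20 time quanta and a 75.0 % sample point, the first method gives 15 time quanta and the second gives 10.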
The tdco calculation is triggered each time the data_bittiming is
changed so that users relying on automated calculation can use the
netlink interface the exact same way without need of new parameters.
For example, a command such as:
ip link set canX type can bitrate 500000 dbitrate 4000000 fd on
would trigger the calculation.
The user using CONFIG_CAN_CALC_BITTIMING who does not want automated
calculation needs to manually set tdco to zero.
For example with:
ip link set canX type can tdco 0 bitrate 500000 dbitrate 4000000 fd on
(if the tdco parameter is provided in a previous command, it will be
overwritten).
If tdcv is set to zero (default), it is automatically calculated by
the transceiver for each frame. As such, there is no code in the kernel
to calculate it.
tdcf has no automated calculation functions because we could not
figure out a formula for this parameter.
Vincent Mailhol [Wed, 24 Feb 2021 00:20:06 +0000 (09:20 +0900)]
can: netlink: move '=' operators back to previous line (checkpatch fix)
Fix the warning triggered by having an '=' at the beginning of the
line by moving it back to the previous line. Also replace all
indentations with a single space so that future entries can be more
easily added.
Extract of ./scripts/checkpatch.pl -f drivers/net/can/dev/netlink.c:
CHECK: Assignment operator '=' should be on the previous line
+ [IFLA_CAN_BITTIMING_CONST]
+ = { .len = sizeof(struct can_bittiming_const) },
CHECK: Assignment operator '=' should be on the previous line
+ [IFLA_CAN_DATA_BITTIMING]
+ = { .len = sizeof(struct can_bittiming) },
CHECK: Assignment operator '=' should be on the previous line
+ [IFLA_CAN_DATA_BITTIMING_CONST]
+ = { .len = sizeof(struct can_bittiming_const) },
Vincent Mailhol [Wed, 24 Feb 2021 00:20:04 +0000 (09:20 +0900)]
can: add new CAN FD bittiming parameters: Transmitter Delay Compensation (TDC)
At high bit rates, the propagation delay from the TX pin to the RX pin
of the transceiver causes measurement errors: the sample point on the
RX pin might occur on the previous bit.
This issue is addressed in ISO 11898-1 section 11.3.3 "Transmitter
delay compensation" (TDC).
This patch adds two new structures: can_tdc and can_tdc_const in order
to implement this TDC.
The structures are then added to can_priv.
A controller supports TDC if and only if can_priv::tdc_const is not
NULL.
TDC is active if and only if:
- fd flag is on
- can_priv::tdc.tdco is not zero.
It is the driver's responsibility to check that those two conditions are met.
No new controller modes are introduced (i.e. no CAN_CTRL_MODE_TDC) in
order not to be redundant with above logic.
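The two rules above can be expressed as a pair of predicates. The structures below are simplified, hypothetical mirrors of can_tdc, can_tdc_const and can_priv (the tdco_max field is a placeholder), so this is a sketch of the logic rather than the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical mirrors of the structures named above. */
struct tdc_sketch {
	uint32_t tdcv, tdco, tdcf;
};

struct tdc_const_sketch {
	uint32_t tdco_max;	/* placeholder field, for illustration only */
};

struct priv_sketch {
	bool fd_on;				/* state of the fd flag */
	struct tdc_sketch tdc;
	const struct tdc_const_sketch *tdc_const;
};

/* Supported if and only if the constant limits are provided. */
static inline bool tdc_supported(const struct priv_sketch *p)
{
	return p->tdc_const != NULL;
}

/* Active if and only if fd is on and tdco is not zero. */
static inline bool tdc_active(const struct priv_sketch *p)
{
	return p->fd_on && p->tdc.tdco != 0;
}
```

Because both conditions can be derived from existing state, a separate control mode flag would carry no extra information.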
The names of the parameters are chosen to match existing CAN
controllers specification. References:
- Bosch C_CAN FD8:
https://www.bosch-semiconductors.com/media/ip_modules/pdf_2/c_can_fd8/users_manual_c_can_fd8_r210_1.pdf
- Microchip CAN FD Controller Module:
http://ww1.microchip.com/downloads/en/DeviceDoc/MCP251XXFD-CAN-FD-Controller-Module-Family-Reference-Manual-20005678B.pdf
- SAM E701/S70/V70/V71 Family:
https://www.mouser.com/datasheet/2/268/60001527A-1284321.pdf
can: dev: can_free_echo_skb(): extend to return can frame length
In order to implement byte queue limits (bql) in CAN drivers, the
length of the CAN frame needs to be passed into the networking stack
even if the transmission failed for some reason.
To avoid calculating this length twice, extend can_free_echo_skb() to
return that value. Convert all users of this function, too.
This patch is the natural extension of commit:
| 9420e1d495e2 ("can: dev: can_get_echo_skb(): extend to return can
| frame length")
So far the creation of the TX echo skb was optional and could be
controlled by the local sender of a CAN frame.
It turns out that the TX echo CAN skb can be piggybacked to carry
information in the driver from the TX- to the TX-complete handler.
Several drivers already use the return value of
can_get_echo_skb() (which is the length of the data field in the CAN
frame) for their number of transferred bytes statistics. The
statistics are not working if CAN echo skbs are disabled.
Another use case is to calculate and set the CAN frame length on the
wire, which is needed for BQL support in both the TX and TX-completion
handler.
For now in can_put_echo_skb(), which is called from the TX handler,
the skb carrying the CAN frame is discarded if no TX echo is
requested, leading to the above illustrated problems.
This patch changes the can_put_echo_skb() function, so that the echo
skb is always generated. If the sender requests no echo, the echo skb
is consumed in __can_get_echo_skb() without being passed into the RX
handler of the networking stack, but the CAN data length and CAN frame
length information is properly returned.
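The new behaviour can be modelled with a toy echo slot. All names here are hypothetical and no real skb handling is involved; the sketch only shows that the lengths survive even when the sender did not request an echo.

```c
#include <stdbool.h>

/* Toy model of one echo slot; not the kernel's echo skb array. */
struct echo_slot {
	bool occupied;
	bool echo_requested;	/* did the sender ask for a TX echo? */
	unsigned int data_len;	/* length of the CAN data field */
	unsigned int frame_len;	/* length of the CAN frame on the wire */
};

/* TX handler side: the slot is always filled, echo requested or not. */
static inline void echo_put(struct echo_slot *s, bool echo_requested,
			    unsigned int data_len, unsigned int frame_len)
{
	s->occupied = true;
	s->echo_requested = echo_requested;
	s->data_len = data_len;
	s->frame_len = frame_len;
}

/*
 * TX-complete side: returns the data length, reports the frame length,
 * and tells the caller whether the echo goes to the RX handler.
 */
static inline unsigned int echo_get(struct echo_slot *s,
				    unsigned int *frame_len,
				    bool *deliver_to_rx)
{
	if (!s->occupied) {
		*frame_len = 0;
		*deliver_to_rx = false;
		return 0;
	}
	s->occupied = false;
	*frame_len = s->frame_len;
	*deliver_to_rx = s->echo_requested;
	return s->data_len;
}
```

In this model a frame sent without an echo request still yields its data and frame lengths on completion, which is what statistics and BQL accounting need.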
Aya Levin [Wed, 13 Jan 2021 07:54:22 +0000 (09:54 +0200)]
net/mlx5e: Update ethtool setting of CQE compression
Remove restriction blocking configuration of CQE compression when PTP rx
filter is set. Instead turn on indication for RX PTP, and try to reopen
the channels.
Aya Levin [Wed, 20 Jan 2021 14:59:27 +0000 (16:59 +0200)]
net/mlx5e: Allow coexistence of CQE compression and HW TS PTP
Update setting HW time-stamp to allow coexistence with CQE compression.
Turn on RX PTP indication and try to reopen the channels. On success,
coexistence with CQE compression is enabled. Otherwise, fall-back to
turning off CQE compression.
Aya Levin [Tue, 16 Feb 2021 10:32:48 +0000 (12:32 +0200)]
net/mlx5e: Add PTP Flow Steering support
When opening PTP channel with MLX5E_PTP_STATE_RX set, add the
corresponding flow steering rules. Capture UDP packets with destination
port 319 and L2 packets with ethertype 0x88F7 and steer them into the RQ
of the PTP channel.
Add API that manages the flow steering rules to be used in the following
patches via safe_reopen_channels mechanism.
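The match criteria above amount to the following classification. In the driver this is done by hardware flow steering rules, not in software; the packet summary struct and function name are hypothetical and only illustrate which traffic ends up in the PTP RQ.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTP_UDP_DPORT 319	/* UDP destination port named above */
#define PTP_ETHERTYPE 0x88F7	/* L2 ethertype named above */

/* Hypothetical, already-parsed packet summary in host byte order. */
struct pkt_summary {
	uint16_t ethertype;
	bool is_udp;
	uint16_t udp_dport;
};

/* True if the packet would be steered to the dedicated PTP RQ. */
static inline bool is_ptp_rx(const struct pkt_summary *p)
{
	if (p->ethertype == PTP_ETHERTYPE)
		return true;
	return p->is_udp && p->udp_dport == PTP_UDP_DPORT;
}
```

Everything that does not match keeps flowing to the regular RX queues.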
Aya Levin [Sun, 17 Jan 2021 06:58:04 +0000 (08:58 +0200)]
net/mlx5e: Introduce Flow Steering ANY API
Add a new FS API which captures the ANY traffic from the traffic
classifier into a dedicated FS table. The table consists of a group
matching the ethertype and a must-be-last group which contains a default
rule redirecting the unmatched packets back to the RSS logic.
Aya Levin [Thu, 14 Jan 2021 15:26:35 +0000 (17:26 +0200)]
net/mlx5e: Introduce Flow Steering UDP API
Add a new FS API which captures the UDP traffic from the traffic
classifier into a dedicated FS table. This API handles both UDP over
IPv4 and IPv6 in the same manner. The tables (one for UDPv4 and another
for UDPv6) consist of a group matching the UDP destination port and a
must-be-last group which contains a default rule redirecting the
unmatched packets back to the RSS logic.
Aya Levin [Thu, 21 Jan 2021 07:32:52 +0000 (09:32 +0200)]
net/mlx5e: Cleanup Flow Steering level
Flow Steering levels are used to determine the order between the tables.
As of today, each one of these tables follows the TTC table, and hijacks
its traffic, and cannot be combined together for now. Putting them in
the same layer better reflects the situation.
Aya Levin [Thu, 25 Feb 2021 17:55:20 +0000 (19:55 +0200)]
net:mlx5e: Add PTP-TIR and PTP-RQT
Add PTP-TIR and initiate its RQT to allow PTP-RQ to integrate into the
safe-reopen flow on configuration change. Add rx_ptp_support flag on a
profile and turn it on for ETH driver. With this flag set, create a
redirect-RQT for PTP-RQ.
Aya Levin [Sun, 7 Mar 2021 13:55:04 +0000 (15:55 +0200)]
net/mlx5e: Add PTP-RX statistics
Like PTP-TX, once the PTP-RX is opened, corresponding statistics appear.
Add indication that PTP-RX was ever opened: rx_ptp_opened. If any of the
PTP RX or TX were opened, display the PTP channel's statistics.
Aya Levin [Sun, 7 Mar 2021 13:47:37 +0000 (15:47 +0200)]
net/mlx5e: Add RQ to PTP channel
Enhance PTP channel to allow PTP without disabling CQE compression. Add
RQ, TIR and PTP_RX_STATE to PTP channel. When this bit is set, PTP
channel manages its RQ, and PTP traffic is directed to the PTP-RQ which
is not affected by compression.
Aya Levin [Mon, 11 Jan 2021 14:45:21 +0000 (16:45 +0200)]
net/mlx5e: Add states to PTP channel
Add PTP TX state to PTP channel, which indicates the corresponding SQ is
available. Further patches in the set extend PTP channel to include RQ.
The PTP channel state will be used for separation and coexistence of RX
and TX PTP. Enhance conditions to verify the TX PTP state is set.
Haiyang Zhang [Mon, 29 Mar 2021 23:21:35 +0000 (16:21 -0700)]
hv_netvsc: Add error handling while switching data path
Add error handling in case of failure to send switching data path message
to the host.
Reported-by: Shachar Raindel <shacharr@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 29 Mar 2021 17:40:49 +0000 (10:40 -0700)]
tcp: fix tcp_min_tso_segs sysctl
tcp_min_tso_segs is now stored in u8, so max value is 255.
255 limit is enforced by proc_dou8vec_minmax().
We can therefore remove the gso_max_segs variable.
Fixes: 47996b489bdc ("tcp: convert elligible sysctls to u8") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 29 Mar 2021 19:25:22 +0000 (12:25 -0700)]
sit: proper dev_{hold|put} in ndo_[un]init methods
After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger
a warning [1]
Issue here is that:
- all dev_put() should be paired with a corresponding prior dev_hold().
- A driver doing a dev_put() in its ndo_uninit() MUST also
do a dev_hold() in its ndo_init(), only when ndo_init()
is returning 0.
Otherwise, register_netdevice() would call ndo_uninit()
in its error path and release a refcount too soon.
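The pairing rule can be illustrated with a toy refcount model. Every name below is hypothetical and the registration flow is heavily simplified; it only shows why the hold belongs in the init callback's success path when the uninit callback does the put.

```c
#include <stdbool.h>

/* Toy device with a plain reference counter. */
struct toy_dev {
	int refcnt;
};

static inline void toy_hold(struct toy_dev *d) { d->refcnt++; }
static inline void toy_put(struct toy_dev *d) { d->refcnt--; }

/* Init callback: take the reference only when returning 0. */
static inline int toy_ndo_init(struct toy_dev *d, bool fail)
{
	if (fail)
		return -1;
	toy_hold(d);
	return 0;
}

/* Uninit callback: drops the reference taken by a successful init. */
static inline void toy_ndo_uninit(struct toy_dev *d)
{
	toy_put(d);
}

/*
 * Simplified registration: uninit runs only if init succeeded and a
 * later step failed, so hold and put stay balanced on every path.
 */
static inline int toy_register(struct toy_dev *d, bool init_fails,
			       bool later_step_fails)
{
	int err = toy_ndo_init(d, init_fails);

	if (err)
		return err;
	if (later_step_fails) {
		toy_ndo_uninit(d);
		return -1;
	}
	return 0;
}
```

If the hold were taken by the caller after registration instead, the error path would run the put before any hold had happened, dropping the count too early.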
Fixes: 919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 29 Mar 2021 18:39:51 +0000 (11:39 -0700)]
ip6_gre: proper dev_{hold|put} in ndo_[un]init methods
After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger
a warning [1]
Issue here is that:
- all dev_put() should be paired with a corresponding dev_hold(),
and vice versa.
- A driver doing a dev_put() in its ndo_uninit() MUST also
do a dev_hold() in its ndo_init(), only when ndo_init()
is returning 0.
Otherwise, register_netdevice() would call ndo_uninit()
in its error path and release a refcount too soon.
ip6_gre for example (among other problematic drivers)
has to use dev_hold() in ip6gre_tunnel_init_common()
instead of from ip6gre_newlink_common(), covering
both ip6gre_tunnel_init() and ip6gre_tap_init().
Note that ip6gre_tunnel_init_common() is not called from
ip6erspan_tap_init() thus we also need to add a dev_hold() there,
as ip6erspan_tunnel_uninit() does call dev_put().
Fixes: 919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Mar 2021 23:27:54 +0000 (16:27 -0700)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
1GbE Intel Wired LAN Driver Updates 2021-03-29
This series contains updates to igc driver only.
Andre Guedes says:
Add XDP support for the igc driver. The approach implemented by this
series follows the same approach implemented in other Intel drivers as
much as possible for the sake of consistency.
The series is organized in two parts. In the first part, i.e. patches
from 1 to 4, igc_main.c and igc_ptp.c code is refactored in preparation
for landing the XDP support, which is introduced in the second part
(patches from 5 to 8).
As far as code organization is concerned, XDP-related helpers are
defined in a new file, igc_xdp.c, and are called by igc_main.c.
The features added by this series have been tested with the samples
provided in samples/bpf/: xdp1, xdp2, xdp_redirect_cpu, and
xdp_redirect_map.
An upcoming series will add support for UMEM and zero-copy features from
AF_XDP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Loic Poulain [Mon, 29 Mar 2021 15:39:32 +0000 (17:39 +0200)]
net: mhi: Allow decoupled MTU/MRU
The MBIM protocol makes the mhi network interface asymmetric: ingress data
received from MHI is MBIM protocol, possibly containing multiple
aggregated IP packets, while egress data received from the network stack is
IP protocol.
This change allows a 'protocol' to specify its own MRU, which, when
specified, is used to allocate MHI RX buffers (skb).
For MBIM, set the default MTU to 1500, which is the usual network MTU
for WWAN IP packets, and the MRU to 3.5K (for allocation efficiency),
allowing an skb to fit in a usual 4K page (including padding,
skb_shared_info, ...).
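A minimal sketch of the buffer-size selection described above, written as plain userspace C. All names (`toy_proto`, `toy_rx_buf_len()`, `toy_page_headroom()`) are hypothetical, the fallback to the MTU when no MRU is given is an assumption, and 3500 is used only as an illustrative value for "3.5K".

```c
#include <stddef.h>

#define TOY_PAGE_SIZE        4096u
#define TOY_MBIM_DEFAULT_MTU 1500u
#define TOY_MBIM_DEFAULT_MRU 3500u /* illustrative value for "3.5K" */

/* Hypothetical per-protocol description; mru == 0 means "not specified". */
struct toy_proto {
	unsigned int mru;
};

/*
 * Pick the RX buffer length: the protocol's MRU when it specifies one,
 * otherwise (assumed here) the interface MTU.
 */
static unsigned int toy_rx_buf_len(const struct toy_proto *proto,
				   unsigned int mtu)
{
	if (proto != NULL && proto->mru != 0)
		return proto->mru;
	return mtu;
}

/*
 * Bytes left in one page for padding and skb metadata. Whether that is
 * enough depends on architecture and configuration, so it is only
 * reported here, not checked.
 */
static unsigned int toy_page_headroom(unsigned int buf_len)
{
	return buf_len <= TOY_PAGE_SIZE ? TOY_PAGE_SIZE - buf_len : 0;
}
```

With these illustrative numbers, an MBIM RX buffer leaves 596 bytes of a 4K page for padding and skb metadata, while a protocol without an MRU keeps using the 1500-byte MTU.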
Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Loic Poulain [Mon, 29 Mar 2021 15:39:31 +0000 (17:39 +0200)]
net: mhi: Add support for non-linear MBIM skb processing
Currently, if the skb is non-linear, due to MHI skb chaining, it is
linearized in the MBIM RX handler prior to MBIM decoding, causing an extra
allocation and copy that can be as large as the maximum MBIM frame
size (32K).
This change introduces MBIM decoding for non-linear skbs, allowing
'large' non-linear MBIM packets to be processed without skb linearization.
The IP packets are simply extracted from the MBIM frame using the
skb_copy_bits helper.
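The idea of extracting a packet from a chained buffer without linearizing it first can be sketched in userspace C. This is not the kernel's skb_copy_bits(); `toy_frag` and `toy_copy_bits()` are hypothetical names for a simplified analogue that copies a byte range spanning several fragments into a flat destination.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment descriptor: one piece of a chained buffer. */
struct toy_frag {
	const unsigned char *data;
	size_t len;
};

/*
 * Copy len bytes starting at logical offset `offset` of the fragment
 * chain into dst. Returns 0 on success, -1 if the requested range runs
 * past the end of the chain (dst may then be partially written).
 */
static int toy_copy_bits(const struct toy_frag *frags, size_t nfrags,
			 size_t offset, void *dst, size_t len)
{
	unsigned char *out = (unsigned char *)dst;
	size_t i;

	for (i = 0; i < nfrags && len > 0; i++) {
		size_t avail, chunk;

		/* Skip fragments that lie entirely before the offset. */
		if (offset >= frags[i].len) {
			offset -= frags[i].len;
			continue;
		}

		avail = frags[i].len - offset;
		chunk = avail < len ? avail : len;
		memcpy(out, frags[i].data + offset, chunk);

		out += chunk;
		len -= chunk;
		offset = 0; /* later fragments are read from their start */
	}

	return len == 0 ? 0 : -1;
}
```

Only the bytes of the requested packet are copied, so the cost scales with the packet size rather than with the full frame.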
Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Mon, 29 Mar 2021 11:23:54 +0000 (12:23 +0100)]
ieee802154: hwsim: remove redundant initialization of variable res
The variable res is being initialized with a value that is
never read and it is being updated later with a new value.
The initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Mon, 29 Mar 2021 15:57:31 +0000 (17:57 +0200)]
Documentation: net: Document resilient next-hop groups
Add a document describing the principles behind resilient next-hop groups,
and some notes about how to configure and offload them.
Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 29 Mar 2021 12:44:27 +0000 (20:44 +0800)]
net: mdio: Correct function name mdio45_links_ok() in comment
Fix the following make W=1 kernel build warning:
drivers/net/mdio.c:95: warning: expecting prototype for mdio_link_ok(). Prototype was for mdio45_links_ok() instead
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 29 Mar 2021 12:42:57 +0000 (20:42 +0800)]
net: bonding: Correct function name bond_change_active_slave() in comment
Fix the following make W=1 kernel build warning:
drivers/net/bonding/bond_main.c:982: warning: expecting prototype for change_active_interface(). Prototype was for bond_change_active_slave() instead
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Mon, 29 Mar 2021 12:40:46 +0000 (20:40 +0800)]
net: phy: Correct function name mdiobus_register_board_info() in comment
Fix the following make W=1 kernel build warning:
drivers/net/phy/mdio-boardinfo.c:63: warning: expecting prototype for mdio_register_board_info(). Prototype was for mdiobus_register_board_info() instead
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Mar 2021 20:37:26 +0000 (13:37 -0700)]
Merge branch 'mlxsw-sampling-fixes'
Ido Schimmel says:
====================
mlxsw: Two sampling fixes
This patchset fixes two bugs in recent sampling submissions.
The first fix, in patch #3, prevents matchall rules with sample action
from being added in front of flower rules on egress. Patches #1-#2 are
preparations aimed at avoiding similar bugs in the future. Patch #4 is a
selftest.
The second fix, in patch #5, prevents sampling from being enabled on a
port if already enabled. Patch #6 is a selftest.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 29 Mar 2021 10:09:47 +0000 (13:09 +0300)]
mlxsw: spectrum: Veto sampling if already enabled on port
The per-port sampling triggers (i.e., ingress / egress) cannot be
enabled twice. That is, the configuration below will not result in
packets being sampled twice:
# tc filter add dev swp1 ingress matchall skip_sw action sample rate 100 group 1
# tc filter add dev swp1 ingress matchall skip_sw action sample rate 100 group 1
Therefore, reject such configurations.
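The veto can be pictured with a small userspace model. This is a sketch under assumptions, not the driver code: the names are hypothetical, a fixed array replaces the driver's sampling triggers hash table, and the choice of -EEXIST as the error is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_PORTS 8u

/* Hypothetical trigger types, mirroring the ingress / egress split. */
enum toy_trigger_type {
	TOY_TRIGGER_INGRESS,
	TOY_TRIGGER_EGRESS,
	TOY_TRIGGER_MAX
};

/* One "enabled" flag per (port, trigger type) pair. */
struct toy_sampling {
	bool enabled[TOY_MAX_PORTS][TOY_TRIGGER_MAX];
};

static void toy_sampling_init(struct toy_sampling *s)
{
	memset(s, 0, sizeof(*s));
}

/*
 * Enable sampling for a (port, type) pair. A second request for the
 * same pair is rejected instead of being silently accepted.
 */
static int toy_sample_enable(struct toy_sampling *s, unsigned int port,
			     enum toy_trigger_type type)
{
	if (port >= TOY_MAX_PORTS || (int)type < 0 ||
	    (int)type >= (int)TOY_TRIGGER_MAX)
		return -EINVAL;
	if (s->enabled[port][type])
		return -EEXIST; /* assumed error code for the veto */

	s->enabled[port][type] = true;
	return 0;
}
```

In this model, repeating the same ingress request on one port fails on the second call, while the egress trigger on that port, or the ingress trigger on another port, can still be enabled.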
Fixes: 90f53c53ec4a ("mlxsw: spectrum: Start using sampling triggers hash table") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Perform the priority check earlier in the function instead of repeating
it for every action. This fixes a bug that allowed matchall rules with
sample action to be added in front of flower rules on egress.
Fixes: 54d0e963f683 ("mlxsw: spectrum_matchall: Add support for egress sampling") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 29 Mar 2021 10:09:44 +0000 (13:09 +0300)]
mlxsw: spectrum_matchall: Convert if statements to a switch statement
The previous patch moved the protocol check out of the action check, so
these if statements can now be converted to a switch statement. Perform
the conversion.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>