Colin Ian King [Mon, 27 Jul 2020 14:17:12 +0000 (15:17 +0100)]
qed: fix assignment of n_rq_elems to incorrect params field
Currently n_rq_elems is being assigned to params.elem_size instead of the
field params.num_elems. Coverity is detecting this as a double assingment
to params.elem_size and reporting this as an usused value on the first
assignment. Fix this.
Addresses-Coverity: ("Unused value") Fixes: b6db3f71c976 ("qed: simplify chain allocation with init params struct") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
sfc: driver for EF100 family NICs, part 1
EF100 is a new NIC architecture under development at Xilinx, based
partly on existing Solarflare technology. As many of the hardware
interfaces resemble EF10, support is implemented within the 'sfc'
driver, which previous patch series "commonised" for this purpose.
In order to maintain bisectability while splitting into patches of a
reasonable size, I had to do a certain amount of back-and-forth with
stubs for things that the common code may try to call, mainly because
we can't do them until we've set up MCDI, but we can't set up MCDI
without probing the event queues, at which point a lot of the common
machinery becomes reachable from event handlers.
Consequently, this first series doesn't get as far as actually sending
and receiving packets. I have a second series ready to follow it
which implements the datapath (and a few other things like ethtool).
Changes from v4:
* Fix build on CONFIG_RETPOLINE=n by using plain prototypes instead
of INDIRECT_CALLABLE_DECLARE.
Changes from v3:
* combine both drivers (sfc_ef100 and sfc) into a single module, to
make non-modular builds work. Patch #4 now adds a few indirections
to support this; the ones in the RX and TX path use indirect-call-
wrappers to minimise the performance impact.
Changes from v2:
* remove MODULE_VERSION.
* call efx_destroy_reset_workqueue() from ef100_exit_module().
* correct uint32_ts to u32s. While I was at it, I fixed a bunch of
other style issues in the function-control-window code.
All in patch #4.
Changes from v1:
* kernel test robot spotted a link error when sfc_ef100 was built
without mdio. It turns out the thing we were trying to link to
was a bogus thing to do on anything but Falcon, so new patch #1
removes it from this driver.
* fix undeclared symbols in patch #4 by shuffling around prototypes
and #includes and adding 'static' where appropriate.
* fix uninitialised variable 'rc2' in patch #7.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 27 Jul 2020 11:56:26 +0000 (12:56 +0100)]
sfc_ef100: PHY probe stub
We can't actually do the MCDI to probe it fully until we have working
MCDI, which comes later, but we need efx->phy_data to be allocated so
that when we get MCDI events the link-state change handler doesn't
NULL-dereference.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 27 Jul 2020 11:55:55 +0000 (12:55 +0100)]
sfc: skeleton EF100 PF driver
No TX or RX path, no MCDI, not even an ifup/down handler.
Besides stubs, the bulk of the patch deals with reading the Xilinx
extended PCIe capability, which tells us where to find our BAR.
Though in the same module, EF100 has its own struct pci_driver,
which is named sfc_ef100.
A small number of additional nic_type methods are added; those in the
TX (tx_enqueue) and RX (rx_packet) paths are called through indirect
call wrappers to minimise the performance impact.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 27 Jul 2020 11:54:55 +0000 (12:54 +0100)]
sfc: remove efx_ethtool_nway_reset()
An MDIO-based n-way restart does not make sense for any of the NICs
supported by this driver, nor for the coming EF100.
Unlike on Falcon (which was already split off into a separate driver),
the PHY on all of Siena, EF10 and EF100 is managed by MC firmware.
While Siena can talk to the PHY over MDIO, doing so for anything other
than debugging purposes (mdio_mii_ioctl) is likely to confuse the
firmware.
(According to the SFC firmware team, this support was originally added
to the Siena driver early in the development of that product, before
it was decided to have firmware manage the PHY.)
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 27 Jul 2020 19:20:40 +0000 (12:20 -0700)]
Merge branch 'Add-PRP-driver'
Murali Karicheri says:
====================
Add PRP driver
This series is dependent on the following patches sent out to
netdev list. All (1-3) are already merged to net/master as of
sending this, but not on the net-next master branch. So need
to apply them to net-next before applying this series. v3 of
the iproute2 patches can be merged to work with this series
as there are no updates since then.
This series adds support for Parallel Redundancy Protocol (PRP)
in the Linux HSR driver as defined in IEC-62439-3. PRP Uses a
Redundancy Control Trailer (RCT) the format of which is
similar to HSR Tag. This is used for implementing redundancy.
RCT consists of 6 bytes similar to HSR tag and contain following
fields:-
- 16-bit sequence number (SeqNr);
- 4-bit LAN identifier (LanId);
- 12 bit frame size (LSDUsize);
- 16-bit suffix (PRPsuffix).
The PRPsuffix identifies PRP frames and distinguishes PRP frames
from other protocols that also append a trailer to their useful
data. The LSDUsize field allows the receiver to distinguish PRP
frames from random, nonredundant frames as an additional check.
LSDUsize is the size of the Ethernet payload inclusive of the
RCT. Sequence number along with LanId is used for duplicate
detection and discard.
PRP node is also known as Dual Attached Node (DAN-P) since it
is typically attached to two different LAN for redundancy.
DAN-P duplicates each of L2 frames and send it over the two
Ethernet links. Each outgoing frame is appended with RCT.
Unlike HSR, these are added to the end of L2 frame and will be
treated as pad by bridges and therefore would be work with
traditional bridges or switches, where as HSR wouldn't as Tag
is prefixed to the Ethenet frame. At the remote end, these are
received and the duplicate frame is discarded before the stripped
frame is send up the networking stack. Like HSR, PRP also sends
periodic Supervision frames to the network. These frames are
received and MAC address from the SV frames are populated in a
database called Node Table. The above functions are grouped into
a block called Link Redundancy Entity (LRE) in the IEC spec.
As there are many similarities between HSR and PRP protocols,
this patch re-uses the code from HSR driver to implement PRP
driver. As per feedback from the RFC series, the implementation
uses the existing HSR Netlink socket interface to create the
PRP interface by adding a new proto parameter to the ip link
command to identify the PRP protocol. iproute2 is enhanced to
implement this new parameter. The hsr_netlink.c is enhanced
to handle the new proto parameter. As suggested during the RFC
review, the driver introduced a proto_ops structure to hold
protocol specfic functions to handle HSR and PRP specific
function pointers and use them in the code based on the
protocol to handle protocol specific part differently in the
driver.
Please review this and provide me feedback so that I can work to
incorporate them and spin the next version if needed.
The patch was tested using two TI AM57x IDK boards for PRP which
are connected back to back over two CPSW Ethernet ports.
To build, enable CONFIG_HSR=y or m
make omap2plus_defconfig
make zImage; make modules; make dtbs
Copy the zImage and dtb files to the file system on SD card
and power on the AM572x boards.
This can be tested on any platforms with 2 Ethernet interfaces.
So will appreciate if you can give it a try and provide your
Tested-by.
Command to create PRP interface
-------------------------------
ifconfig eth0 0.0.0.0 down
ifconfig eth1 0.0.0.0 down
ifconfig eth0 hw ether 70:FF:76:1C:0E:8C
ifconfig eth1 hw ether 70:FF:76:1C:0E:8C
ifconfig eth0 up
ifconfig eth1 up
ip link add name prp0 type hsr slave1 eth0 slave2 eth1 supervision 45 proto 1
ifconfig prp0 192.168.2.10
ifconfig eth0 0.0.0.0 down
ifconfig eth1 0.0.0.0 down
ifconfig eth0 hw ether 70:FF:76:1C:0E:8D
ifconfig eth1 hw ether 70:FF:76:1C:0E:8D
ifconfig eth0 up
ifconfig eth1 up
ip link add name prp0 type hsr slave1 eth0 slave2 eth1 supervision 45 proto 1
ifconfig prp0 192.168.2.20
command to show node table
----------------------------
Ping the peer board after the prp0 interface is up.
The remote node (DAN-P) will be shown in the node table as below.
Try to capture the raw PRP frames at the eth0 interface as
tcpdump -i eth0 -xxx
Sample Supervision frames and ARP frames shown below.
==================================================================================
Successive Supervision frames captured with tcpdump (with RCT at the end):
Other tests done.
- Connect a SAN (eth0 and eth1 without prp interface) and
do ping test from eth0 (192.168.2.40) to prp0 (192.168.2.10)
verify the SAN node shows at the correct link A and B as shown
in the node table dump
- Regress HSR interface using 3 nodes connected in a ring topology.
create hsr link version 0. Do iperf3 test between all nodes
create hsr link version 1. Do iperf3 test between all nodes.
ifconfig eth0 0.0.0.0 down
ifconfig eth1 0.0.0.0 down
ifconfig eth0 hw ether 70:FF:76:1C:0E:8C
ifconfig eth1 hw ether 70:FF:76:1C:0E:8C
ifconfig eth0 up
ifconfig eth1 up
ip link add name hsr0 type hsr slave1 eth0 slave2 eth1 supervision 45 version 0
ifconfig hsr0 192.168.2.10
HSR V1
ifconfig eth0 0.0.0.0 down
ifconfig eth1 0.0.0.0 down
ifconfig eth0 hw ether 70:FF:76:1C:0E:8C
ifconfig eth1 hw ether 70:FF:76:1C:0E:8C
ifconfig eth0 up
ifconfig eth1 up
ip link add name hsr0 type hsr slave1 eth0 slave2 eth1 supervision 45 version 1
ifconfig hsr0 192.168.2.10
Logs at
DUT-1 : https://pastebin.ubuntu.com/p/6PSJbZwQ6y/
DUT-2 : https://pastebin.ubuntu.com/p/T8TqJsPRHc/
DUT-3 : https://pastebin.ubuntu.com/p/VNzpv6HzKj/
- Build tests :-
Build with CONFIG_HSR=m
allmodconfig build
build with CONFIG_HSR=y and rebuild with sparse checker
make C=1 zImage; make modules
Version history:
v5 : Fixed comments about Kconfig changes on Patch 1/7 against v4
Rebased to netnext/master branch.
v4 : fixed following vs v3
reverse xmas tree for local variables
check for return type in call to skb_put_padto()
v3 : Separated bug fixes from this series and send them for immediate merge
But for that this is same as v2.
v2 : updated comments on RFC. Following are the main changes:-
- Removed the hsr_prp prefix
- Added PRP information in header files to indicate
the support for PRP explicitely
- Re-use netlink socket interface with an added
parameter proto for identifying PRP.
- Use function pointers using a proto_ops struct
to do things differently for PRP vs HSR.
RFC: initial version posted and discussed at
https://www.spinics.net/lists/netdev/msg656229.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Print PRP specific information from node table as part of debugfs
node table display. Also display the node as DAN-H or DAN-P depending
on the info from node table.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
DAN-P (Dual Attached Nodes PRP) nodes are expected to receive
traditional IP packets as well as PRP (Parallel Redundancy
Protocol) tagged (trailer) packets. PRP trailer is 6 bytes
of PRP protocol unit called RCT, Redundancy Control Trailer
(RCT) similar to HSR tag. PRP network can have traditional
devices such as bridges/switches or PC attached to it and
should be able to communicate. Regular Ethernet devices treat
the RCT as pads. This patch adds logic to format L2 frames
from network stack to add a trailer (RCT) and send it as
duplicates over the slave interfaces when the protocol is
PRP as per IEC 62439-3. At the ingress, it strips the trailer,
do duplicate detection and rejection and forward a stripped
frame up the network stack. PRP device should accept frames
from Singly Attached Nodes (SAN) and thus the driver mark
the link where the frame came from in the node table.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hsr: define and use proto_ops ptrs to handle hsr specific frames
As a preparatory patch to introduce PRP, refactor the code specific to
handling HSR frames into separate functions and call them through
proto_ops function pointers.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: prp: add supervision frame generation utility function
Add support for generation of PRP supervision frames. For PRP,
supervision frame format is similar to HSR version 0, but have
a PRP Redundancy Control Trailer (RCT) added and uses a different
message type, PRP_TLV_LIFE_CHECK_DD. Also update
is_supervision_frame() to include the new message type used for
PRP supervision frame.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hsr: introduce protocol specific function pointers
As a preparatory patch to introduce support for PRP protocol, add a
protocol ops ptr in the private hsr structure to hold function
pointers as some of the functions at protocol level packet
handling is different for HSR vs PRP. It is expected that PRP will
add its of set of functions for protocol handling. Modify existing
hsr_announce() function to call proto_ops->send_sv_frame() to send
supervision frame for HSR. This is expected to be different for PRP.
So introduce a ops function ptr, send_sv_frame() for the same and
initialize it to send_hsr_supervsion_frame(). Modify hsr_announce()
to call proto_ops->send_sv_frame().
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
hsr: enhance netlink socket interface to support PRP
Parallel Redundancy Protocol (PRP) is another redundancy protocol
introduced by IEC 63439 standard. It is similar to HSR in many
aspects:-
- Use a pair of Ethernet interfaces to created the PRP device
- Use a 6 byte redundancy protocol part (RCT, Redundancy Check
Trailer) similar to HSR Tag.
- Has Link Redundancy Entity (LRE) that works with RCT to implement
redundancy.
Key difference is that the protocol unit is a trailer instead of a
prefix as in HSR. That makes it inter-operable with tradition network
components such as bridges/switches which treat it as pad bytes,
whereas HSR nodes requires some kind of translators (Called redbox) to
talk to regular network devices. This features allows regular linux box
to be converted to a DAN-P box. DAN-P stands for Dual Attached Node - PRP
similar to DAN-H (Dual Attached Node - HSR).
Add a comment at the header/source code to explicitly state that the
driver files also handles PRP protocol as well.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qed: fix the allocation of the chains with an external PBL
Dan reports static checker warning:
"The patch 9b6ee3cf95d3: "qed: sanitize PBL chains allocation" from Jul
23, 2020, leads to the following static checker warning:
drivers/net/ethernet/qlogic/qed/qed_chain.c:299 qed_chain_alloc_pbl()
error: uninitialized symbol 'pbl_virt'.
drivers/net/ethernet/qlogic/qed/qed_chain.c
249 static int qed_chain_alloc_pbl(struct qed_dev *cdev, struct qed_chain *chain)
250 {
251 struct device *dev = &cdev->pdev->dev;
252 struct addr_tbl_entry *addr_tbl;
253 dma_addr_t phys, pbl_phys;
254 __le64 *pbl_virt;
^^^^^^^^^^^^^^^^
[...]
271 if (chain->b_external_pbl)
272 goto alloc_pages;
^^^^^^^^^^^^^^^^ uninitialized
[...]
298 /* Fill the PBL table with the physical address of the page */
299 pbl_virt[i] = cpu_to_le64(phys);
^^^^^^^^^^^
[...]
"
This issue was introduced with commit c3a321b06a80 ("qed: simplify
initialization of the chains with an external PBL"), when
chain->pbl_sp.table_virt initialization was moved up to
qed_chain_init_params().
Fix it by initializing pbl_virt with an already filled chain struct field.
Fixes: c3a321b06a80 ("qed: simplify initialization of the chains with an external PBL") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 27 Jul 2020 18:47:33 +0000 (11:47 -0700)]
Merge branch 'bnxt_en-update'
Michael Chan says:
====================
bnxt_en update.
This patchset removes the PCIe histogram and other debug register
data from ethtool -S. The removed data are not counters and they have
very large and constantly fluctuating values that are not suitable for
the ethtool -S decimal counter display.
The rest of the patches implement counter rollover for all hardware
counters that are not 64-bit counters. Different generations of
hardware have different counter widths. The driver will now query
the counter widths of all counters from firmware and implement
rollover support on all non-64-bit counters.
The last patch adds the PCIe histogram and other PCIe register data back
using the ethtool -d interface.
Add support to dump PXP registers and PCIe statistics.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:44 +0000 (05:40 -0400)]
bnxt_en: Switch over to use the 64-bit software accumulated counters.
Now we can report all the full 64-bit CPU endian software accumulated
counters instead of the hw counters, some of which may be less than
64-bit wide. Define the necessary macros to access the software
counters.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:43 +0000 (05:40 -0400)]
bnxt_en: Accumulate all counters.
Now that we have the infrastructure in place, add the new function
bnxt_accumulate_all_stats() to periodically accumulate and check for
counter rollover of all ring stats and port stats.
A chip bug was also discovered that could cause some ring counters to
become 0 during DMA. Workaround by ignoring zeros on the affected
chips.
Some older frimware will reset port counters during ifdown. We need
to check for that and free the accumulated port counters during ifdown
to prevent bogus counter overflow detection during ifup.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:42 +0000 (05:40 -0400)]
bnxt_en: Retrieve hardware masks for port counters.
If supported by newer firmware, make the firmware call to query all
the port counter masks. If not supported, assume 40-bit port
counter masks.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:41 +0000 (05:40 -0400)]
bnxt_en: Retrieve hardware counter masks from firmware if available.
Newer firmware has a new call HWRM_FUNC_QSTATS_EXT to retrieve the
masks of all ring counters. Make this call when supported to
initialize the hardware masks of all ring counters. If the call
is not available, assume 48-bit ring counter masks on P5 chips.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:40 +0000 (05:40 -0400)]
bnxt_en: Allocate additional memory for all statistics blocks.
Some of these DMAed hardware counters are not full 64-bit counters and
so we need to accumulate them as they overflow. Allocate copies of these
DMA statistics memory blocks with the same size for accumulation. The
hardware counter widths are also counter specific so we allocate
memory for masks that correspond to each counter.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:39 +0000 (05:40 -0400)]
bnxt_en: Refactor statistics code and structures.
The driver manages multiple statistics structures of different sizes.
They are all allocated, freed, and handled practically the same. Define
a new bnxt_stats_mem structure and common allocation and free functions
for all staistics memory blocks.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:38 +0000 (05:40 -0400)]
bnxt_en: Use macros to define port statistics size and offset.
The port statistics structures have hard coded padding and offset.
Define macros to make this look cleaner.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 27 Jul 2020 09:40:37 +0000 (05:40 -0400)]
bnxt_en: Update firmware interface to 1.10.1.54.
Main changes are 200G support and fixing the definitions of discard and
error counters to match the hardware definitions.
Because the HWRM_PORT_PHY_QCFG message size has now exceeded the max.
encapsulated response message size of 96 bytes from the PF to the VF,
we now need to cap this message to 96 bytes for forwarding. The forwarded
response only needs to contain the basic link status and speed information
and can be capped without adding the new information.
v2: Fix bnxt_re compile error.
Cc: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_en: Remove PCIe non-counters from ethtool statistics
Remove PCIe non-counters display from ethtool statistics, as
they are not simple counters but register dump. The next few
patches will add logic to detect counter roll-over and it won't
work with these PCIe non-counters.
There will be a follow up patch to get PCIe information via
ethtool register dump.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cited commit mistakenly copied provided option to 'val' instead of to
'mfc':
```
- if (copy_from_user(&mfc, optval, sizeof(mfc))) {
+ if (copy_from_sockptr(&val, optval, sizeof(val))) {
```
Fix this by copying the option to 'mfc'.
selftest router_multicast.sh before:
$ ./router_multicast.sh
smcroutectl: Unknown or malformed IPC message 'a' from client.
smcroutectl: failed removing multicast route, does not exist.
TEST: mcast IPv4 [FAIL]
Multicast not received on first host
TEST: mcast IPv6 [ OK ]
smcroutectl: Unknown or malformed IPC message 'a' from client.
smcroutectl: failed removing multicast route, does not exist.
TEST: RPF IPv4 [FAIL]
Multicast not received on first host
TEST: RPF IPv6 [ OK ]
selftest router_multicast.sh after:
$ ./router_multicast.sh
TEST: mcast IPv4 [ OK ]
TEST: mcast IPv6 [ OK ]
TEST: RPF IPv4 [ OK ]
TEST: RPF IPv6 [ OK ]
Fixes: 01ccb5b48f08 ("net/ipv4: switch ip_mroute_setsockopt to sockptr_t") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net/smc: unique reason code for exceeded max dmb count
When the maximum dmb buffer limit for an ism device is reached no more
dmb buffers can be registered. When this happens the reason code is set
to SMC_CLC_DECL_MEM indicating out-of-memory. This is the same reason
code that is used when no memory could be allocated for the new dmb
buffer.
This is confusing for users when they see this error but there is more
memory available. To solve this set a separate new reason code when the
maximum dmb limit exceeded.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
s390/ism: indicate correct error reason in ism_alloc_dmb()
When the ism driver allocates a new dmb in ism_alloc_dmb() it must
first check for and reserve a slot in the sba bitmap. When
find_next_zero_bit() finds no free slot then the return code is -ENOMEM.
This code conflicts with the error when the alloc() fails later in the
code. As a result of that the caller can not differentiate
between out-of-memory conditions and sba-bitmap-full conditions.
Fix that by using the return code -ENOSPC when the sba slot
reservation failed.
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 26 Jul 2020 10:58:27 +0000 (12:58 +0200)]
sfc: drop unnecessary list_empty
list_for_each_safe is able to handle an empty list.
The only effect of avoiding the loop is not initializing the
index variable.
Drop list_empty tests in cases where these variables are not
used.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
The UDP reuseport conflict was a little bit tricky.
The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.
At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.
This requires moving the reuseport_has_conns() logic into the callers.
While we are here, get rid of inline directives as they do not belong
in foo.c files.
The other changes were cases of more straightforward overlapping
modifications.
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into master
Pull RISC-V fixes from Palmer Dabbelt:
"A few more fixes this week:
- A fix to avoid using SBI calls during kasan initialization, as the
SBI calls themselves have not been probed yet.
- Three fixes related to systems with multiple memory regions"
* tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Parse all memory blocks to remove unusable memory
RISC-V: Do not rely on initrd_start/end computed during early dt parsing
RISC-V: Set maximum number of mapped pages correctly
riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence
Merge tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- Fix a section end page alignment assumption that was causing
crashes
- Fix ORC unwinding on freshly forked tasks which haven't executed
yet and which have empty user task stacks
- Fix the debug.exception-trace=1 sysctl dumping of user stacks,
which was broken by recent maccess changes"
* tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/dumpstack: Dump user space code correctly again
x86/stacktrace: Fix reliable check for empty user task stacks
x86/unwind/orc: Fix ORC for newly forked tasks
x86, vmlinux.lds: Page-align end of ..page_aligned sections
Merge tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull uprobe fix from Ingo Molnar:
"Fix an interaction/regression between uprobes based shared library
tracing & GDB"
* tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression
Merge tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull timer fix from Ingo Molnar:
"Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's"
* tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
Merge tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull scheduler fixes from Ingo Molnar:
"Fix a race introduced by the recent loadavg race fix, plus add a debug
check for a hard to debug case of bogus wakeup function flags"
* tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Warn if garbage is passed to default_wake_function()
sched: Fix race against ptrace_freeze_trace()
Merge tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull EFI fixes from Ingo Molnar:
"Various EFI fixes:
- Fix the layering violation in the use of the EFI runtime services
availability mask in users of the 'efivars' abstraction
- Revert build fix for GCC v4.8 which is no longer supported
- Clean up some x86 EFI stub details, some of which are borderline
bugs that copy around garbage into padding fields - let's fix these
out of caution.
- Fix build issues while working on RISC-V support
- Avoid --whole-archive when linking the stub on arm64"
* tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Revert "efi/x86: Fix build with gcc 4"
efi/efivars: Expose RT service availability via efivars abstraction
efi/libstub: Move the function prototypes to header file
efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds
efi/libstub/arm64: link stub lib.a conditionally
efi/x86: Only copy upto the end of setup_header
efi/x86: Remove unused variables
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into master
Pull networking fixes from David Miller:
1) Fix RCU locaking in iwlwifi, from Johannes Berg.
2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.
3) Fix race in updating pause settings in bnxt_en, from Vasundhara
Volam.
4) Propagate error return properly during unbind failures in ax88172a,
from George Kennedy.
5) Fix memleak in adf7242_probe, from Liu Jian.
6) smc_drv_probe() can leak, from Wang Hai.
7) Don't muck with the carrier state if register_netdevice() fails in
the bonding driver, from Taehee Yoo.
8) Fix memleak in dpaa_eth_probe, from Liu Jian.
9) Need to check skb_put_padto() return value in hsr_fill_tag(), from
Murali Karicheri.
10) Don't lose ionic RSS hash settings across FW update, from Shannon
Nelson.
11) Fix clobbered SKB control block in act_ct, from Wen Xu.
12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang.
13) IS_UDPLITE cleanup a long time ago, incorrectly handled
transformations involving UDPLITE_RECV_CC. From Miaohe Lin.
14) Unbalanced locking in netdevsim, from Taehee Yoo.
15) Suppress false-positive error messages in qed driver, from Alexander
Lobakin.
16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.
17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.
18) Uninitialized value in geneve_changelink(), from Cong Wang.
19) Fix deadlock in xen-netfront, from Andera Righi.
19) flush_backlog() frees skbs with IRQs disabled, so should use
dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov
Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
drivers/net/wan: lapb: Corrected the usage of skb_cow
dev: Defer free of skbs in flush_backlog
qrtr: orphan socket in qrtr_release()
xen-netfront: fix potential deadlock in xennet_remove()
flow_offload: Move rhashtable inclusion to the source file
geneve: fix an uninitialized value in geneve_changelink()
bonding: check return value of register_netdevice() in bond_newlink()
tcp: allow at most one TLP probe per flight
AX.25: Prevent integer overflows in connect and sendmsg
cxgb4: add missing release on skb in uld_send()
net: atlantic: fix PTP on AQC10X
AX.25: Prevent out-of-bounds read in ax25_sendmsg()
sctp: shrink stream outq when fails to do addstream reconf
sctp: shrink stream outq only when new outcnt < old outcnt
AX.25: Fix out-of-bounds read in ax25_connect()
enetc: Remove the mdio bus on PF probe bailout
net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
net: ethernet: ave: Fix error returns in ave_init
drivers/net/wan/x25_asy: Fix to make it work
ipvs: fix the connection sync failed in some cases
...
riscv: Parse all memory blocks to remove unusable memory
Currently, maximum physical memory allowed is equal to -PAGE_OFFSET.
That's why we remove any memory blocks spanning beyond that size. However,
it is done only for memblock containing linux kernel which will not work
if there are multiple memblocks.
Process all memory blocks to figure out how much memory needs to be removed
and remove at the end instead of updating the memblock list in place.
RISC-V: Do not rely on initrd_start/end computed during early dt parsing
Currently, initrd_start/end are computed during early_init_dt_scan
but used during arch_setup. We will get the following panic if initrd is used
and CONFIG_DEBUG_VIRTUAL is turned on.
To avoid the error, initrd_start/end can be computed from phys_initrd_start/size
in setup itself. It also improves the initrd placement by aligning the start
and size with the page size.
Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Xie He [Fri, 24 Jul 2020 16:33:47 +0000 (09:33 -0700)]
drivers/net/wan: lapb: Corrected the usage of skb_cow
This patch fixed 2 issues with the usage of skb_cow in LAPB drivers
"lapbether" and "hdlc_x25":
1) After skb_cow fails, kfree_skb should be called to drop a reference
to the skb. But in both drivers, kfree_skb is not called.
2) skb_cow should be called before skb_push so that is can ensure the
safety of skb_push. But in "lapbether", it is incorrectly called after
skb_push.
More details about these 2 issues:
1) The behavior of calling kfree_skb on failure is also the behavior of
netif_rx, which is called by this function with "return netif_rx(skb);".
So this function should follow this behavior, too.
2) In "lapbether", skb_cow is called after skb_push. This results in 2
logical issues:
a) skb_push is not protected by skb_cow;
b) An extra headroom of 1 byte is ensured after skb_push. This extra
headroom has no use in this function. It also has no use in the
upper-layer function that this function passes the skb to
(x25_lapb_receive_frame in net/x25/x25_dev.c).
So logically skb_cow should instead be called before skb_push.
Cc: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 25 Jul 2020 03:03:28 +0000 (20:03 -0700)]
Merge branch 'net-dsa-mv88e6xxx-port-mtu-support'
Chris Packham says:
====================
net: dsa: mv88e6xxx: port mtu support
This series connects up the mv88e6xxx switches to the dsa infrastructure for
configuring the port MTU. The first patch is also a bug fix which might be a
candiatate for stable.
I've rebased this series on top of net-next/master to pick up Andrew's change
for the gigabit switches. Patch 1 and 2 are unchanged (aside from adding
Andrew's Reviewed-by). Patch 3 is reworked to make use of the existing mtu
support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Packham [Thu, 23 Jul 2020 23:21:22 +0000 (11:21 +1200)]
net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU
Some of the chips in the mv88e6xxx family don't support jumbo
configuration per port. But they do have a chip-wide max frame size that
can be used. Use this to approximate the behaviour of configuring a port
based MTU.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Packham [Thu, 23 Jul 2020 23:21:21 +0000 (11:21 +1200)]
net: dsa: mv88e6xxx: Support jumbo configuration on 6190/6190X
The MV88E6190 and MV88E6190X both support per port jumbo configuration
just like the other GE switches. Install the appropriate ops.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Packham [Thu, 23 Jul 2020 23:21:20 +0000 (11:21 +1200)]
net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration
The MV88E6097 chip does not support configuring jumbo frames. Prior to
commit 5f4366660d65 only the 6352, 6351, 6165 and 6320 chips configured
jumbo mode. The refactor accidentally added the function for the 6097.
Remove the erroneous function pointer assignment.
Fixes: 5f4366660d65 ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames") Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
IRQs are disabled when freeing skbs in input queue.
Use the IRQ safe variant to free skbs here.
Fixes: 145dd5f9c88f ("net: flush the softnet backlog in process context") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
RISC-V: Set maximum number of mapped pages correctly
Currently, maximum number of mapper pages are set to the pfn calculated
from the memblock size of the memblock containing kernel. This will work
until that memblock spans the entire memory. However, it will be set to
a wrong value if there are multiple memblocks defined in kernel
(e.g. with efi runtime services).
Set the the maximum value to the pfn calculated from dram size.
Merge tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into master
Pull PCI fixes from Bjorn Helgaas:
- Reject invalid IRQ 0 command line argument for virtio_mmio because
IRQ 0 now generates warnings (Bjorn Helgaas)
- Revert "PCI/PM: Assume ports without DLL Link Active train links in
100 ms", which broke nouveau (Bjorn Helgaas)
* tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
virtio-mmio: Reject invalid IRQ 0 command line argument
Cong Wang [Fri, 24 Jul 2020 16:45:51 +0000 (09:45 -0700)]
qrtr: orphan socket in qrtr_release()
We have to detach sock from socket in qrtr_release(),
otherwise skb->sk may still reference to this socket
when the skb is released in tun->queue, particularly
sk->sk_wq still points to &sock->wq, which leads to
a UAF.
Reported-and-tested-by: syzbot+6720d64f31c081c2f708@syzkaller.appspotmail.com Fixes: 28fb4e59a47d ("net: qrtr: Expose tunneling endpoint to user space") Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Fri, 24 Jul 2020 13:46:30 +0000 (21:46 +0800)]
net: hix5hd2_gmac: Remove unneeded cast from memory allocation
Remove casting the values returned by memory allocation function.
Coccinelle emits WARNING:
./drivers/net/ethernet/hisilicon/hix5hd2_gmac.c:1027:9-23: WARNING:
casting value returned by memory allocation function to (struct sg_desc *) is useless.
This issue was detected by using the Coccinelle software.
Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
l2tp hasn't been kept up to date with the static analysis checks offered
by checkpatch.pl.
This patchset builds on the series: "l2tp: cleanup checkpatch.pl
warnings" and "l2tp: further checkpatch.pl cleanups" to resolve some of
the remaining checkpatch warnings in l2tp.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:57 +0000 (16:31 +0100)]
l2tp: WARN_ON rather than BUG_ON in l2tp_session_free
l2tp_session_free called BUG_ON if the tunnel magic feather value wasn't
correct. The intent of this was to catch lifetime bugs; for example
early tunnel free due to incorrect use of reference counts.
Since the tunnel magic feather being wrong indicates either early free
or structure corruption, we can avoid doing more damage by simply
leaving the tunnel structure alone. If the tunnel refcount isn't
dropped when it should be, the tunnel instance will remain in the
kernel, resulting in the tunnel structure and socket leaking.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:56 +0000 (16:31 +0100)]
l2tp: remove BUG_ON refcount value in l2tp_session_free
l2tp_session_free is only called by l2tp_session_dec_refcount when the
reference count reaches zero, so it's of limited value to validate the
reference count value in l2tp_session_free itself.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:55 +0000 (16:31 +0100)]
l2tp: WARN_ON rather than BUG_ON in l2tp_session_queue_purge
l2tp_session_queue_purge is used during session shutdown to drop any
skbs queued for reordering purposes according to L2TP dataplane rules.
The BUG_ON in this function checks the session magic feather in an
attempt to catch lifetime bugs.
Rather than crashing the kernel with a BUG_ON, we can simply WARN_ON and
refuse to do anything more -- in the worst case this could result in a
leak. However this is highly unlikely given that the session purge only
occurs from codepaths which have obtained the session by means of a lookup
via. the parent tunnel and which check the session "dead" flag to
protect against shutdown races.
While we're here, have l2tp_session_queue_purge return void rather than
an integer, since neither of the callsites checked the return value.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:54 +0000 (16:31 +0100)]
l2tp: don't BUG_ON seqfile checks in l2tp_ppp
checkpatch advises that WARN_ON and recovery code are preferred over
BUG_ON which crashes the kernel.
l2tp_ppp has a BUG_ON check of struct seq_file's private pointer in
pppol2tp_seq_start prior to accessing data through that pointer.
Rather than crashing, we can simply bail out early and return NULL in
order to terminate the seq file processing in much the same way as we do
when reaching the end of tunnel/session instances to render.
Retain a WARN_ON to help trace possible bugs in this area.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:53 +0000 (16:31 +0100)]
l2tp: don't BUG_ON session magic checks in l2tp_ppp
checkpatch advises that WARN_ON and recovery code are preferred over
BUG_ON which crashes the kernel.
l2tp_ppp.c's BUG_ON checks of the l2tp session structure's "magic" field
occur in code paths where it's reasonably easy to recover:
* In the case of pppol2tp_sock_to_session, we can return NULL and the
caller will bail out appropriately. There is no change required to
any of the callsites of this function since they already handle
pppol2tp_sock_to_session returning NULL.
* In the case of pppol2tp_session_destruct we can just avoid
decrementing the reference count on the suspect session structure.
In the worst case scenario this results in a memory leak, which is
preferable to a crash.
Convert these uses of BUG_ON to WARN_ON accordingly.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:52 +0000 (16:31 +0100)]
l2tp: remove BUG_ON in l2tp_tunnel_closeall
l2tp_tunnel_closeall is only called from l2tp_core.c, and it's easy
to statically analyse the code path calling it to validate that it
should never be passed a NULL tunnel pointer.
Having a BUG_ON checking the tunnel pointer triggers a checkpatch
warning. Since the BUG_ON is of no value, remove it to avoid the
warning.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Fri, 24 Jul 2020 15:31:51 +0000 (16:31 +0100)]
l2tp: remove BUG_ON in l2tp_session_queue_purge
l2tp_session_queue_purge is only called from l2tp_core.c, and it's easy
to statically analyse the code paths calling it to validate that it
should never be passed a NULL session pointer.
Having a BUG_ON checking the session pointer triggers a checkpatch
warning. Since the BUG_ON is of no value, remove it to avoid the
warning.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 24 Jul 2020 13:03:08 +0000 (09:03 -0400)]
icmp: revise rfc4884 tests
1) Only accept packets with original datagram len field >= header len.
The extension header must start after the original datagram headers.
The embedded datagram len field is compared against the 128B minimum
stipulated by RFC 4884. It is unlikely that headers extend beyond
this. But as we know the exact header length, check explicitly.
2) Remove the check that datagram length must be <= 576B.
This is a send constraint. There is no value in testing this on rx.
Within private networks it may be known safe to send larger packets.
Process these packets.
This test was also too lax. It compared original datagram length
rather than entire icmp packet length. The stand-alone fix would be:
- if (hlen + skb->len > 576)
+ if (-skb_network_offset(skb) + skb->len > 576)
Fixes: eba75c587e81 ("icmp: support rfc 4884") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 24 Jul 2020 13:09:19 +0000 (14:09 +0100)]
sctp: remove redundant initialization of variable status
The variable status is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed. Also put the variable declarations into
reverse christmas tree order.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrea Righi [Fri, 24 Jul 2020 08:59:10 +0000 (10:59 +0200)]
xen-netfront: fix potential deadlock in xennet_remove()
There's a potential race in xennet_remove(); this is what the driver is
doing upon unregistering a network device:
1. state = read bus state
2. if state is not "Closed":
3. request to set state to "Closing"
4. wait for state to be set to "Closing"
5. request to set state to "Closed"
6. wait for state to be set to "Closed"
If the state changes to "Closed" immediately after step 1 we are stuck
forever in step 4, because the state will never go back from "Closed" to
"Closing".
Make sure to check also for state == "Closed" in step 4 to prevent the
deadlock.
Also add a 5 sec timeout any time we wait for the bus state to change,
to avoid getting stuck forever in wait_event().
Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: openvswitch: fixes potential deadlock in dp cleanup code
The previous patch introduced a deadlock, this patch fixes it by making
sure the work is canceled without holding the global ovs lock. This is
done by moving the reorder processing one layer up to the netns level.
Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage") Reported-by: syzbot+2c4ff3614695f75ce26c@syzkaller.appspotmail.com Reported-by: syzbot+bad6507e5db05017b008@syzkaller.appspotmail.com Reviewed-by: Paolo <pabeni@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sctp: fix slab-out-of-bounds in SCTP_DELAYED_SACK processing
This sockopt accepts two kinds of parameters, using struct
sctp_sack_info and struct sctp_assoc_value. The mentioned commit didn't
notice an implicit cast from the smaller (latter) struct to the bigger
one (former) when copying the data from the user space, which now leads
to an attempt to write beyond the buffer (because it assumes the storing
buffer is bigger than the parameter itself).
Fix it by allocating a sctp_sack_info on stack and filling it out based
on the small struct for the compat case.
Changelog stole from an earlier patch from Marcelo Ricardo Leitner.
Fixes: ebb25defdc17 ("sctp: pass a kernel pointer to sctp_setsockopt_delayed_ack") Reported-by: syzbot+0e4699d000d8b874d8dc@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 24 Jul 2020 23:39:28 +0000 (16:39 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2020-07-23
This series contains updates to ice driver only.
Jake refactors ice_discover_caps() to reduce the number of AdminQ calls
made. Splits ice_parse_caps() to separate functions to update function
and device capabilities separately to allow for updating outside of
initialization.
Akeem adds power management support.
Paul G refactors FC and FEC code to aid in restoring of PHY settings
on media insertion. Implements lenient mode and link override support.
Adds link debug info and formats existing debug info to be more
readable. Adds support to check and report additional autoneg
capabilities. Implements the capability to detect media cage in order to
differentiate AUI types as Direct Attach or backplane.
Bruce implements Total Port Shutdown for devices that support it.
Lev renames low_power_ctrl field to lower_power_ctrl_an to be more
descriptive of the field.
Doug reports AOC types as media type fiber.
Paul S adds code to handle 1G SGMII PHY type.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
get rid of the address_space override in setsockopt v2
setsockopt is the last place in architecture-independ code that still
uses set_fs to force the uaccess routines to operate on kernel pointers.
This series adds a new sockptr_t type that can contained either a kernel
or user pointer, and which has accessors that do the right thing, and
then uses it for setsockopt, starting by refactoring some low-level
helpers and moving them over to it before finally doing the main
setsockopt method.
Note that apparently the eBPF selftests do not even cover this path, so
the series has been tested with a testing patch that always copies the
data first and passes a kernel pointer. This is something that works for
most common sockopts (and is something that the ePBF support relies on),
but unfortunately in various corner cases we either don't use the passed
in length, or in one case actually copy data back from setsockopt, or in
case of bpfilter straight out do not work with kernel pointers at all.
Against net-next/master.
Changes since v1:
- check that users don't pass in kernel addresses
- more bpfilter cleanups
- cosmetic mptcp tweak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net: optimize the sockptr_t for unified kernel/user address spaces
For architectures like x86 and arm64 we don't need the separate bit to
indicate that a pointer is a kernel pointer as the address spaces are
unified. That way the sockptr_t can be reduced to a union of two
pointers, which leads to nicer calling conventions.
The only caveat is that we need to check that users don't pass in kernel
address and thus gain access to kernel memory. Thus the USER_SOCKPTR
helper is replaced with a init_user_sockptr function that does this check
and returns an error if it fails.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154] Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>