Collect descriptors of all ULD and LLD hardware queues managed
by LLD.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: when max_tx_rate is 0 disable tx rate limiting
in ndo_set_vf_rate() when max_tx_rate is 0 disable tx
rate limiting for that vf.
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In the previous implement, the query operation for fibre port and copper
port are mixed. This patch refines it by seperating them based on the port
type.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: Remove redundant codes of query advertised flow control abilitiy
If the advertised flow control ability has been fetched using
phy_ethtool_ksettings_get() or hclge_get_link_mode() then it is
unnecessary to fetch them again later using hclge_get_flowctrl_adv().
This patch removes it.
Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:54 +0000 (11:21 +0100)]
net: hns3: Change the dst mac addr of loopback packet
Currently, the dst mac addr of loopback packet is the same as
the host' mac addr, the SSU component may loop back the packet
to host before the packet reaches mac or serdes, which will defect
the purpose of mac or serdes selftest.
This patch changes it by adding 0x1f to the last byte of dst mac
addr.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:53 +0000 (11:21 +0100)]
net: hns3: Only update mac configuation when necessary
Currently only fiber port checks if it is necessay to set the
mac through firmware when link is changed, this patch unify the
checking to allow the copper port do the checking too.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:52 +0000 (11:21 +0100)]
net: hns3: Preserve vlan 0 in hardware table
When netdev is down, the stack will delete the vlan from
hardware including vlan0, which will cause problem when
doing loopback selftest when netdev is down.
This patch fixes it by always preserving vlan 0 in hardware,
because vlan 0 is defalut vlan, which should always be in
hardware.
Fixes: c39c4d98dc65 ("net: hns3: Add mac loopback selftest support in hns3 driver") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:51 +0000 (11:21 +0100)]
net: hns3: Fix ping exited problem when doing lp selftest
When ping is runnig and user executes the loopback selftest, the
ping cmd will stop and exit.
This patch fixes it by using the hns3_nic_net_open/stop to offline
the netdev when doing loopback selftest.
Fixes: c39c4d98dc65 ("net: hns3: Add mac loopback selftest support in hns3 driver") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:50 +0000 (11:21 +0100)]
net: hns3: Fix for loopback selftest failed problem
Tqp and mac need to be enabled when doing loopback selftest,
ae_algo->ops->start/stop is used to do the job, there is a
time window between ae_algo->ops->start/stop and loopback setup,
which will cause selftest failed problem when there is frame
coming in during that time window.
This patch fixes it by enabling the tqp and mac during loopback
setup process.
Fixes: c39c4d98dc65 ("net: hns3: Add mac loopback selftest support in hns3 driver") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:49 +0000 (11:21 +0100)]
net: hns3: Implement shutdown ops in hns3 pci driver
This patch implements shutdown ops in hns3 pci driver, which
unloads the hns3 driver and set the power state to D3hot.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
PF uses hdev->vlan_table to manage the port vlan table. In function
hclge_set_vlan_filter_hw(), it checks whether a vlan id has been used,
by foreach all the vport bits. It should use macro HCLGE_VPORT_NUM,
not VLAN_N_VID as the foreach condition.
Fixes: 6c251711b37f ("net: hns3: Disable vf vlan filter when vf vlan table is full") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 3 Sep 2018 10:21:47 +0000 (11:21 +0100)]
net: hns3: Fix for multicast failure
When the lower 24 bits of the IPV6 link-local addresses at both
ends are the same, the multicast MAC address for Neigbour Discovery
is the same. The multicast for Neigbour Discovery will fail.
This patch fixes it by including the bonding uplink port in the
multicast group.
Fixes: 46a3df9f9718("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 3 Sep 2018 10:21:46 +0000 (11:21 +0100)]
net: hns3: Fix for vf vlan delete failed problem
There are only 128 entries in vf vlan table, if user has added
more than 128 vlan, fw will ignore it and disable the vf vlan
table. So when user deletes the vlan entry that has not been
set to vf vlan table, fw will return not found result and driver
treat that as error, which will cause vlan delete failed problem.
This patch fixes it by returning ok when fw returns not found
result.
Fixes: 6c251711b37f ("net: hns3: Disable vf vlan filter when vf vlan table is full") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tipc: correct structure parameter comments for topsrv
Remove the following obsolete parameter comments of tipc_topsrv struct:
@rcvbuf_cache
@tipc_conn_new
@tipc_conn_release
@tipc_conn_recvmsg
@imp
@type
Add the comments for the missing parameters below of tipc_topsrv struct:
@awork
@listener
Remove the unused or duplicated parameter comments of tipc_conn struct:
@outqueue_lock
@rx_action
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
liquidio: Added delayed work for periodically updating the link statistics.
Signed-off-by: Pradeep Nalla <pradeep.nalla@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx5e: IPoIB, Add ndo stats support for IPoIB netdevices
Expose RX and TX counters by implementing ndo_get_stats64 operation for
both parent devices.
After this change, all the relevant statistics can be retrieved using
ifconfig.
Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx5e: IPoIB, Initialize max_opened_tc in mlx5i_init flow
Enhanced ipoib does not initialize max_opened_tc causing wrong ethtool
statistics. As mlx5e_grp_sw_update_stats relies on this variable, without
this change, the TX statistics will not be updated.
Fixes: 05909babce53 ("net/mlx5e: Avoid reset netdev stats on configuration changes") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 2 Sep 2018 23:16:24 +0000 (16:16 -0700)]
Merge branch 'Full-phylink-support-for-mv88e6352'
Andrew Lunn says:
====================
Full phylink support for mv88e6352
These two patches implement full phylink support for the mv88e6352
family, when using an SFP connected to its SERDES interface. This adds
interrupt support to the SERDES, so that we get interrupts on link
up/down, and then make calls phydev_link_change().
The first patch is a minor bug fix, which does not seem to affect any
current features, so i'm not submitting it for stable. It is however
required for configuring SERDES interrupts.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 2 Sep 2018 16:13:15 +0000 (18:13 +0200)]
net: dsa: mv88e6xxx: Add SERDES phydev_link_change for 6352
The 6352 family has one SERDES interface, which can be used by either
port 4 or port 5. Add interrupt support for the SERDES interface, and
report when the link status changes.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 2 Sep 2018 16:13:14 +0000 (18:13 +0200)]
net: dsa: mv88e6xxx: Fix writing to a PHY page.
After changing to the needed page, actually write the value to the
register!
Fixes: 09cb7dfd3f14 ("net: dsa: mv88e6xxx: describe PHY page and SerDes") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: remove useless add operation when init sysctl_max_tw_buckets
cp_hashinfo.ehash_mask is always an odd number, which is set in function
alloc_large_system_hash(). See bellow,
if (_hash_mask)
*_hash_mask = (1 << log2qty) - 1; <<< always odd number
Hence the local variable 'cnt' is a even number, as a result of that it is
no difference to do the incrementation here.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Moritz Fischer [Fri, 31 Aug 2018 20:30:54 +0000 (13:30 -0700)]
net: nixge: Fix Kconfig warning with OF_MDIO
Fix Kconfig warning with OF_MDIO where OF_MDIO was
selected unconditionally instead of only when
OF is actually enabled.
Fixes 7e8d5755be0e ("net: nixge: Add support for 64-bit platforms") Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 2 Sep 2018 21:13:31 +0000 (14:13 -0700)]
Merge branch 'mvneta-some-small-improvements'
Jisheng Zhang says:
====================
net: mvneta: some small improvements
patch1 removes the NETIF_F_GRO check ourself, because the net subsystem
will handle it for us.
patch2 enables NETIF_F_RXCSUM by default, since the driver and HW
supports the feature.
patch3 is a small optimization, to reduce smp_processor_id() calling
in mvneta_tx_done_gbe.
since v1:
- based on net-next tree
- remove the fix patches, since they should be based on net branch.
- Add Gregory's Reviewed-by tag
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After the patch, the loop looks like(under arm64):
...
bl 0 <_raw_spin_lock>
str w23, [x28,#132]
...
where w23 is loaded so be ready before the loop.
>From another side, mvneta_tx_done_gbe() is called from mvneta_poll()
which is in non-preemptible context, so it's safe to call the
smp_processor_id() function once.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jisheng Zhang [Fri, 31 Aug 2018 08:10:03 +0000 (16:10 +0800)]
net: mvneta: enable NETIF_F_RXCSUM by default
The code and HW supports NETIF_F_RXCSUM, so let's enable it by default.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Jisheng Zhang [Fri, 31 Aug 2018 08:09:13 +0000 (16:09 +0800)]
net: mvneta: Don't check NETIF_F_GRO ourself
napi_gro_receive() checks NETIF_F_GRO bit as well, if the bit is not
set, we will go through GRO_NORMAL in napi_skb_finish(), so fall back
to netif_receive_skb_internal(), so we don't need to check NETIF_F_GRO
ourself.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bgmac: remove set but not used variable 'err'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/broadcom/bgmac.c: In function 'bgmac_dma_alloc':
drivers/net/ethernet/broadcom/bgmac.c:619:6: warning:
variable 'err' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 31 Aug 2018 19:29:49 +0000 (12:29 -0700)]
net: dsa: b53: Provide sensible defaults
The SRAB driver is the default way to communicate with the integrated
switch on iProc platforms and the MMAP driver is the way to communicate
with the integrated switch on DSL BCM63xx and CM BCM33xx.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Fri, 31 Aug 2018 12:03:56 +0000 (12:03 +0000)]
liquidio: remove set but not used variable 'irh'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/cavium/liquidio/request_manager.c: In function 'lio_process_iq_request_list':
drivers/net/ethernet/cavium/liquidio/request_manager.c:383:27: warning:
variable 'irh' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Driver displays an error message for each unrecognized dcbx TLV that's
received from the peer or configured on the device. It is observed that
syslog will be flooded with such messages in certain scenarios e.g.,
frequent link-flaps/lldp-transactions. Changing the severity of this
message to verbose level as it's not an error scenario/message.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 29 Aug 2018 15:47:19 +0000 (17:47 +0200)]
rds: store socket timestamps as ktime_t
rds is the last in-kernel user of the old do_gettimeofday()
function. Convert it over to ktime_get_real() to make it
work more like the generic socket timestamps, and to let
us kill off do_gettimeofday().
A follow-up patch will have to change the user space interface
to deal better with 32-bit tasks, which may use an incompatible
layout for 'struct timespec'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vakul Garg [Wed, 29 Aug 2018 09:56:55 +0000 (15:26 +0530)]
net/tls: Add support for async decryption of tls records
When tls records are decrypted using asynchronous acclerators such as
NXP CAAM engine, the crypto apis return -EINPROGRESS. Presently, on
getting -EINPROGRESS, the tls record processing stops till the time the
crypto accelerator finishes off and returns the result. This incurs a
context switch and is not an efficient way of accessing the crypto
accelerators. Crypto accelerators work efficient when they are queued
with multiple crypto jobs without having to wait for the previous ones
to complete.
The patch submits multiple crypto requests without having to wait for
for previous ones to complete. This has been implemented for records
which are decrypted in zero-copy mode. At the end of recvmsg(), we wait
for all the asynchronous decryption requests to complete.
The references to records which have been sent for async decryption are
dropped. For cases where record decryption is not possible in zero-copy
mode, asynchronous decryption is not used and we wait for decryption
crypto api to complete.
For crypto requests executing in async fashion, the memory for
aead_request, sglists and skb etc is freed from the decryption
completion handler. The decryption completion handler wakesup the
sleeping user context when recvmsg() flags that it has done sending
all the decryption requests and there are no more decryption requests
pending to be completed.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Reviewed-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Fri, 31 Aug 2018 04:08:01 +0000 (04:08 +0000)]
bnxt_en: remove set but not used variable 'rx_stats'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c: In function 'bnxt_vf_rep_rx':
drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c:212:28: warning:
variable 'rx_stats' set but not used [-Wunused-but-set-variable]
struct bnxt_vf_rep_stats *rx_stats;
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jose Abreu [Thu, 30 Aug 2018 14:09:48 +0000 (15:09 +0100)]
net: stmmac: Add CBS support in XGMAC2
XGMAC2 uses the same CBS mechanism as GMAC5, only registers offset
changes. Lets use the same TC callbacks and implement the .config_cbs
callback in XGMAC2 core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The Freescale/NXP DPAA2 Ethernet driver was first included in
drivers/staging, due to its dependencies on two components located
there at the time of its initial submission:
* the fsl-mc bus driver, which was moved to drivers/bus in kernel 4.17
* the dpio driver, which was moved to drivers/soc/fsl in kernel 4.18
More information on the DPAA2 architecture and the interactions
between the fsl-mc bus and the objects present on it can be found in:
Documentation/networking/dpaa2/overview.rst
For easier review, the patch is generated without the -M option,
although the driver files are moved without any code changes.
changes since v1[1]:
* remove RFC label, since dependencies have been merged on net-next
* add patch fixing a possible race at probe (reported by Andrew Lunn)
Ioana Radulescu [Wed, 29 Aug 2018 09:42:40 +0000 (04:42 -0500)]
dpaa2-eth: Move DPAA2 Ethernet driver from staging to drivers/net
The DPAA2 Ethernet driver supports Freescale/NXP SoCs with DPAA2
(DataPath Acceleration Architecture v2). The driver manages
network objects discovered on the fsl-mc bus.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Only call netdev_register() at the end of the probe function,
once all other necessary bits and pieces are properly initialized.
We keep the rest of the netdevice initialization code in place,
at the earlier point of the probing sequence, including the
settings previously done in ndo_init.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Wed, 29 Aug 2018 21:53:56 +0000 (14:53 -0700)]
tcp: change IPv6 flow-label upon receiving spurious retransmission
Currently a Linux IPv6 TCP sender will change the flow label upon
timeouts to potentially steer away from a data path that has gone
bad. However this does not help if the problem is on the ACK path
and the data path is healthy. In this case the receiver is likely
to receive repeated spurious retransmission because the sender
couldn't get the ACKs in time and has recurring timeouts.
This patch adds another feature to mitigate this problem. It
leverages the DSACK states in the receiver to change the flow
label of the ACKs to speculatively re-route the ACK packets.
In order to allow triggering on the second consecutive spurious
RTO, the receiver changes the flow label upon sending a second
consecutive DSACK for a sequence number below RCV.NXT.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 29 Aug 2018 17:15:36 +0000 (10:15 -0700)]
net_sched: add missing tcf_lock for act_connmark
According to the new locking rule, we have to take tcf_lock
for both ->init() and ->dump(), as RTNL will be removed.
However, it is missing for act_connmark.
Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add AF_XDP zero-copy support for i40e driver (!), from Björn and Magnus.
2) BPF verifier improvements by giving each register its own liveness
chain which allows to simplify and getting rid of skip_callee() logic,
from Edward.
3) Add bpf fs pretty print support for percpu arraymap, percpu hashmap
and percpu lru hashmap. Also add generic percpu formatted print on
bpftool so the same can be dumped there, from Yonghong.
4) Add bpf_{set,get}sockopt() helper support for TCP_SAVE_SYN and
TCP_SAVED_SYN options to allow reflection of tos/tclass from received
SYN packet, from Nikita.
5) Misc improvements to the BPF sockmap test cases in terms of cgroup v2
interaction and removal of incorrect shutdown() calls, from John.
6) Few cleanups in xdp_umem_assign_dev() and xdpsock samples, from Prashant.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Magnus Karlsson [Fri, 31 Aug 2018 11:40:02 +0000 (13:40 +0200)]
xsk: i40e: get rid of useless struct xdp_umem_props
This commit gets rid of the structure xdp_umem_props. It was there to
be able to break a dependency at one point, but this is no longer
needed. The values in the struct are instead stored directly in the
xdp_umem structure. This simplifies the xsk code as well as af_xdp
zero-copy drivers and as a bonus gets rid of one internal header file.
The i40e driver is also adapted to the new interface in this commit.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Magnus Karlsson [Fri, 31 Aug 2018 11:40:01 +0000 (13:40 +0200)]
i40e: fix possible compiler warning in xsk TX path
With certain gcc versions, it was possible to get the warning
"'tx_desc' may be used uninitialized in this function" for the
i40e_xmit_zc. This was not possible, however this commit simplifies
the code path so that this warning is no longer emitted.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Prashant Bhole [Fri, 31 Aug 2018 01:00:49 +0000 (10:00 +0900)]
samples/bpf: xdpsock, minor fixes
- xsks_map size was fixed to 4, changed it MAX_SOCKS
- Remove redundant definition of MAX_SOCKS in xdpsock_user.c
- In dump_stats(), add NULL check for xsks[i]
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN sample program
Sample program which shows TCP_SAVE_SYN/TCP_SAVED_SYN usage example:
bpf program which is doing TOS/TCLASS reflection (server would reply
with a same TOS/TCLASS as client).
Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN options for bpf_(set|get)sockopt
Adding support for two new bpf get/set sockopts: TCP_SAVE_SYN (set)
and TCP_SAVED_SYN (get). This would allow for bpf program to build
logic based on data from ingress SYN packet (e.g. doing tcp's tos/
tclass reflection (see sample prog)) and do it transparently from
userspace program point of view.
Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Colin Ian King [Thu, 30 Aug 2018 14:27:18 +0000 (15:27 +0100)]
xdp: remove redundant variable 'headroom'
Variable 'headroom' is being assigned but is never used hence it is
redundant and can be removed.
Cleans up clang warning:
variable ‘headroom’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
David S. Miller [Fri, 31 Aug 2018 21:08:34 +0000 (14:08 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2018-08-30
This series contains updates to i40e, i40evf and virtchnl.
Jake implements helper functions to use an array to handle the queue
stats which reduces the boiler plate code as well as keep the complexity
localized to a few functions.
Paweł adds the ability to change a VF's MAC address from the host side
without having to reload the VF driver on the guest side.
Paul adds a check to ensure that the number of queues that the PF sends
to the VF is equal to or less than the maximum number of queues the VF
can support.
Mitch fixes an issue caught by GCC 8, where we need to not include the
terminating null in the length of the string for strncpy().
Lihong fixes a VF issue to ensure that it does not enter into
promiscuous mode when macvlan is added to the VF. Fixed a potential
crash after a VF is removed, since the workqueue sync for the adminq
task was not being cancelled.
Harshitha fixes the type for field_flags in the virtchnl_filter struct.
Martyna removes an unnecessary check in a conditional if statement.
Björn fixes an issue reported by Jesper Dangaard Brouer, where the
driver was reporting incorrect statistics when XDP was enabled.
Jan fixes the potential reporting of incorrect speed settings.
Patryk fixed an issue where the flag
I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING was getting set when any offload is
set via ethtool. Resolved by only setting this flag when VLAN offload
is enabled. Also ensure we hold the rtnl lock when we are clearing the
interrupt scheme. Added a check when deleting the MAC address from the
VF to ensure that the MAC address was not set by the PF and if it was,
do not delete it.
v2: updated patch 2 in the series based on community feedback from David
Miller to inline a function
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lihong Yang [Tue, 28 Aug 2018 17:16:08 +0000 (10:16 -0700)]
i40evf: cancel workqueue sync for adminq when a VF is removed
If a VF is being removed, there is no need to continue with the
workqueue sync for the adminq task, thus cancel it. Without this call,
when VFs are created and removed right away, there might be a chance for
the driver to crash with events stuck in the adminq.
Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Patryk Małek [Tue, 28 Aug 2018 17:16:02 +0000 (10:16 -0700)]
i40evf: Don't enable vlan stripping when rx offload is turned on
With current implementation of i40evf_set_features when user sets
any offload via ethtool we set I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING
as a required aq which triggers driver to call
i40evf_enable_vlan_stripping. This shouldn't take place.
This patches fixes it by setting the flag only when VLAN offload
is turned on.
Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jan Sokolowski [Tue, 28 Aug 2018 17:16:01 +0000 (10:16 -0700)]
i40e: Check and correct speed values for link on open
If our card has been put in an unstable state due to
other drivers interacting with it, speed settings
might be incorrect. If incorrect, forcefully reset them
on open to known default values.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Björn Töpel [Fri, 24 Aug 2018 11:21:59 +0000 (13:21 +0200)]
i40e: report correct statistics when XDP is enabled
When XDP is enabled, the driver will report incorrect
statistics. Received frames will reported as transmitted frames.
This commits fixes the i40e implementation of ndo_get_stats64 (struct
net_device_ops), so that iproute2 will report correct statistics
(e.g. when running "ip -stats link show dev eth0") even when XDP is
enabled.
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
virtchnl: use u8 type for a field in the virtchnl_filter struct
The virtchnl_filter struct has a field called field_flags. A previous
commit mistakenly had the type to be a __u8. What we want is for the
field to be an unsigned 8 bit value, so let's just use the existing
kernel type u8 for that.
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lihong Yang [Mon, 20 Aug 2018 15:12:31 +0000 (08:12 -0700)]
i40evf: set IFF_UNICAST_FLT flag for the VF
Set IFF_UNICAST_FLT flag for the VF to prevent it from entering
promiscuous mode when macvlan is added to the VF.
Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Mon, 20 Aug 2018 15:12:30 +0000 (08:12 -0700)]
i40e: use correct length for strncpy
Caught by GCC 8. When we provide a length for strncpy, we should not
include the terminating null. So we must tell it one less than the size
of the destination buffer.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A PF can send any number of queues to the VF and the VF may not
be able to support that many. Check to see that the number of
queues is less than or equal to the max number of queues the
VF can have.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Paweł Jabłoński [Mon, 20 Aug 2018 15:12:26 +0000 (08:12 -0700)]
i40evf: Change a VF mac without reloading the VF driver
Add possibility to change a VF mac address from host side
without reloading the VF driver on the guest side. Without
this patch it is not possible to change the VF mac because
executing i40evf_virtchnl_completion function with
VIRTCHNL_OP_GET_VF_RESOURCES opcode resets the VF mac
address to previous value.
Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Mon, 20 Aug 2018 15:12:25 +0000 (08:12 -0700)]
i40evf: update ethtool stats code and use helper functions
Fix a bug in the way we handled VF queues, by always showing stats for
the maximum number of queues, even if they aren't allocated. It is not
safe to change the number of strings reported to ethtool, as grabbing
statistics occurs over multiple ethtool ops for which the rtnl_lock()
cannot be held the entire time.
Avoid this by always reporting queue stats for the maximum number of
queues in the netdevice. Share some of the helper functionality for
adding stats with the PF code in i40e_ethtool_stats.h
This should reduce the chance of potential future bugs, and make adding
new statistics easier.
Note for the queue stats, unlike the PF driver we do not keep an array
of queue pointers, but an array of queues, so care must be taken to
avoid accessing queue memory that hasn't yet been allocated.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Mon, 20 Aug 2018 15:12:24 +0000 (08:12 -0700)]
i40e: move ethtool stats boiler plate code to i40e_ethtool_stats.h
Move the boiler plate structures and helper functions we recently
added into their own header file, so that the complete collection is
located together.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Mon, 20 Aug 2018 15:12:23 +0000 (08:12 -0700)]
i40e: convert queue stats to i40e_stats array
Use an i40e_stats array to handle the queue stats, instead of coding
similar functionality separately. Because of how the queue stats are
accessed on some kernels, we can't easily use i40e_add_ethtool_stats.
Instead, implement a separate helper, i40e_add_queue_stats, which we'll
use instead. This helper will correctly implement the
u64_stats_fetch_begin_irq logic and allow retries until successful. We
share the most complex code by re-using i40e_add_one_ethtool_stat.
This logic additionally easily supports skipping disabled rings by using
a ternary operator before calling the u64_stats_fetch_begin_irq()
function, so that we correctly zero-out the stats values without having
to perform two separate sections of code.
This significantly reduces the boiler plate code in
i40e_get_ethtool_stats, and helps keep the complex logic contained to as
few functions as possible.
With this patch, we've finally converted all the statistics to use the
helpers and the i40e_stats function.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Daniel Borkmann [Thu, 30 Aug 2018 12:03:54 +0000 (14:03 +0200)]
Merge branch 'bpf-bpffs-bpftool-dump-with-btf'
Yonghong Song says:
====================
Commit a26ca7c982cb ("bpf: btf: Add pretty print support to the
basic arraymap") and Commit 699c86d6ec21 ("bpf: btf: add pretty print
for hash/lru_hash maps") added bpffs pretty print for array, hash and
lru hash maps. The pretty print gives users a structurally formatted
dump for keys/values which much easy to understand than raw bytes.
This patch set implemented bpffs pretty print support for
percpu arraymap, percpu hashmap and percpu lru hashmap.
For complex key/value types, the pretty print here is even more useful
due to:
. large volumne of data making it even harder to correlate bytes
to a particular field in a particular cpu.
. kernel rounds the value size for each cpu to multiple of 8.
User has to be aware of this otherwise wrong value may be
derived from cpu 1/2/...
For example, we may have a bpffs pretty print like below:
43602: {
cpu0: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
cpu1: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
cpu2: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
cpu3: {43602,0,-43602,0x3,0xaa52,0x3,{43602|[82,170,0,0,0,0,0,0]},ENUM_TWO}
}
for a percpu map.
This patch also added percpu formatted print on bpftool. For example,
bpftool may print like below:
{
"key": 0,
"values": [{
"cpu": 0,
"value": {
"ui32": 0,
"ui16": 0,
}
},{
"cpu": 1,
"value": {
"ui32": 1,
"ui16": 0,
}
},{
"cpu": 2,
"value": {
"ui32": 2,
"ui16": 0,
}
},{
"cpu": 3,
"value": {
"ui32": 3,
"ui16": 0,
}
}
]
}
Patch #1 implemented bpffs pretty print for percpu arraymap/hash/lru_hash
in kernel. Patch #2 added the test case in tools bpf selftest test_btf.
Patch #3 added percpu map btf based dump.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The btf pretty print is added to percpu arraymap,
percpu hashmap and percpu lru hashmap.
For each <key, value> pair, the following will be
added to plain/json output:
David S. Miller [Thu, 30 Aug 2018 05:13:47 +0000 (22:13 -0700)]
Merge tag 'mac80211-next-for-davem-2018-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Only a few changes at this point:
* new channels in 60 GHz
* clarify (average) ACK signal reporting API
* expose ieee80211_send_layer2_update() for all drivers
* start/stop mac80211's TXQs properly when required
* avoid regulatory restore with IE ignoring
* spelling: contidion -> condition
* fully implement WFA Multi-AP backhaul
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Wed, 29 Aug 2018 03:52:10 +0000 (11:52 +0800)]
vxlan: reduce dirty cache line in vxlan_find_mac
vxlan_find_mac() unconditionally set f->used for every packet,
this causes a cache miss for every packet, since remote, hlist
and used of vxlan_fdb share the same cache line, which are
accessed when send every packets.
so f->used is set only if not equal to jiffies, to reduce dirty
cache line times, this gives 3% speed-up with small packets.
Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Change soft command handling to fix the possible race condition when the
process handles a response of a soft command that was already freed by an
application which got timeout for this request.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Manlunas [Wed, 29 Aug 2018 01:51:44 +0000 (18:51 -0700)]
liquidio: remove obsolete functions and data structures
1. Remove unused functions and data structures.
2. Change the sending of the remaining soft commands to synchronous.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Manlunas [Wed, 29 Aug 2018 01:51:40 +0000 (18:51 -0700)]
liquidio: change octnic_ctrl_pkt to do synchronous soft commands
1. Change struct octnic_ctrl_pkt to support synchronous operation.
2. Change code which use structure octnic_ctrl_pkt to send sc's
synchronously.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Manlunas [Wed, 29 Aug 2018 01:51:35 +0000 (18:51 -0700)]
liquidio: make soft command calls synchronous
1. Add wait_for_sc_completion_timeout() for waiting the response and
handling common response errors
2. Send sc's synchronously: remove unused callback function,
and context structure; use wait_for_sc_completion_timeout() to wait
its response.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Manlunas [Wed, 29 Aug 2018 01:51:30 +0000 (18:51 -0700)]
liquidio: improve soft command handling
1. Set LIO_SC_MAX_TMO_MS as the maximum timeout value for a soft command
(sc). All sc's use this value as a hard timeout value. Add expiry_time
in struct octeon_soft_command to keep the hard timeout value. The field
wait_time and timeout in struct octeon_soft_command will be obsoleted in
the last patch of this patch series.
2. Add processing a synchronous sc in sc response thread
lio_process_ordered_list. The memory allocated for a synchronous sc will
be freed by lio_process_ordered_list() to the sc pool.
3. Add two response lists for lio_process_ordered_list to process the
storage allocated for sc's:
OCTEON_DONE_SC_LIST response list keeps all sc's which will be freed to
the pool after their requestors have finished processing the responses.
OCTEON_ZOMBIE_SC_LIST response list keeps all sc's which have got
LIO_SC_MAX_TMO_MS timeout.
When an sc gets a hard timeout, lio_process_order_list() will recheck
its status 1 ms later. If the status has not updated by the firmware at
that time, the sc will be removed from OCTEON_DONE_SC_LIST response list
to OCTEON_ZOMBIE_SC_LIST response list. The sc's in the
OCTEON_ZOMBIE_SC_LIST response list will be freed when the driver is
unloaded.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/tls: Calculate nsg for zerocopy path without skb_cow_data.
decrypt_skb fails if the number of sg elements required to map it
is greater than MAX_SKB_FRAGS. nsg must always be calculated, but
skb_cow_data adds unnecessary memcpy's for the zerocopy case.
The new function skb_nsg calculates the number of scatterlist elements
required to map the skb without the extra overhead of skb_cow_data.
This patch reduces memcpy by 50% on my encrypted NBD benchmarks.
Reported-by: Vakul Garg <Vakul.garg@nxp.com> Reviewed-by: Vakul Garg <Vakul.garg@nxp.com> Tested-by: Vakul Garg <Vakul.garg@nxp.com> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Moritz Fischer [Tue, 28 Aug 2018 22:16:31 +0000 (15:16 -0700)]
net: nixge: Add support for 64-bit platforms
Add support for 64-bit platforms to driver.
The hardware only supports 32-bit register accesses
so the accesses need to be split up into two writes
when setting the current and tail descriptor values.
Signed-off-by: Moritz Fischer <mdf@kernel.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Oskolkov [Tue, 28 Aug 2018 18:36:20 +0000 (11:36 -0700)]
selftests/net: add ip_defrag selftest
This test creates a raw IPv4 socket, fragments a largish UDP
datagram and sends the fragments out of order.
Then repeats in a loop with different message and fragment lengths.
Then does the same with overlapping fragments (with overlapping
fragments the expectation is that the recv times out).
Tested:
root@<host># time ./ip_defrag.sh
ipv4 defrag
PASS
ipv4 defrag with overlaps
PASS
real 1m7.679s
user 0m0.628s
sys 0m2.242s
A similar test for IPv6 is to follow.
Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Oskolkov [Tue, 28 Aug 2018 18:36:19 +0000 (11:36 -0700)]
ip: fail fast on IP defrag errors
The current behavior of IP defragmentation is inconsistent:
- some overlapping/wrong length fragments are dropped without
affecting the queue;
- most overlapping fragments cause the whole frag queue to be dropped.
This patch brings consistency: if a bad fragment is detected,
the whole frag queue is dropped. Two major benefits:
- fail fast: corrupted frag queues are cleared immediately, instead of
by timeout;
- testing of overlapping fragments is now much easier: any kind of
random fragment length mutation now leads to the frag queue being
discarded (IP packet dropped); before this patch, some overlaps were
"corrected", with tests not seeing expected packet drops.
Note that in one case (see "if (end&7)" conditional) the current
behavior is preserved as there are concerns that this could be
legitimate padding.
Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 28 Aug 2018 18:32:55 +0000 (11:32 -0700)]
liquidio: fix race condition in instruction completion processing
In lio_enable_irq, the pkt_in_done count register was being cleared to
zero. However, there could be some completed instructions which were not
yet processed due to budget and limit constraints.
So, only write this register with the number of actual completions
that were processed.
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 28 Aug 2018 18:19:54 +0000 (11:19 -0700)]
liquidio: remove unnecessary delay when processing IQ responses
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>