]> git.proxmox.com Git - mirror_ubuntu-jammy-kernel.git/log
mirror_ubuntu-jammy-kernel.git
4 years agoIB/hfi1: Add packet histogram trace event
Grzegorz Andrejczuk [Mon, 11 May 2020 16:07:01 +0000 (12:07 -0400)]
IB/hfi1: Add packet histogram trace event

Add a simple trace event taking context number and building simple
histogram to print packets distribution between contexts.

Link: https://lore.kernel.org/r/20200511160700.173205.84270.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/{hfi1, ipoib, rdma}: Broadcast ping sent packets which exceeded mtu size
Gary Leshner [Mon, 11 May 2020 16:06:55 +0000 (12:06 -0400)]
IB/{hfi1, ipoib, rdma}: Broadcast ping sent packets which exceeded mtu size

When in connected mode ipoib sent broadcast pings which exceeded the mtu
size for broadcast addresses.

Add an mtu attribute to the rdma_netdev structure which ipoib sets to its
mcast mtu size.

The RDMA netdev uses this value to determine if the skb length is too long
for the mtu specified and if it is, drops the packet and logs an error
about the errant packet.

Link: https://lore.kernel.org/r/20200511160655.173205.14546.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Activate the dummy netdev
Grzegorz Andrejczuk [Mon, 11 May 2020 16:06:49 +0000 (12:06 -0400)]
IB/hfi1: Activate the dummy netdev

As described in earlier patches, ipoib netdev will share receive
contexts with existing VNIC netdev through a dummy netdev. The
following changes are made to achieve that:
- Set up netdev receive contexts after user contexts. A function is
  added to count the available netdev receive contexts.
- Add functions to set/get receive map table free index.
- Rename NUM_VNIC_MAP_ENTRIES as NUM_NETDEV_MAP_ENTRIES.
- Let the dummy netdev own the receive contexts instead of VNIC.
- Allocate the dummy netdev when the hfi1 device is added and free it
  when the device is removed.
- Initialize AIP RSM rules when the IpoIb rxq is initialized and
  remove the rules when it is de-initialized.
- Convert VNIC to use the dummy netdev.

Link: https://lore.kernel.org/r/20200511160649.173205.4626.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add rx functions for dummy netdev
Grzegorz Andrejczuk [Mon, 11 May 2020 16:06:43 +0000 (12:06 -0400)]
IB/hfi1: Add rx functions for dummy netdev

This patch adds the rx functions for the dummy netdev:
- Functions to allocate/free the dummy netdev.
- Functions to allocate/free receiving contexts for the netdev.
- Functions to initialize/de-initialize the receive queue.
- Functions to enable/disable the receive queue.

Link: https://lore.kernel.org/r/20200511160643.173205.75087.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add interrupt handler functions for accelerated ipoib
Grzegorz Andrejczuk [Mon, 11 May 2020 16:06:37 +0000 (12:06 -0400)]
IB/hfi1: Add interrupt handler functions for accelerated ipoib

This patch adds the interrupt handler function, the NAPI poll
function, and its associated helper functions for receiving
accelerated ipoib packets. While we are here, fix the formats
of two error printouts.

Link: https://lore.kernel.org/r/20200511160637.173205.64890.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add functions to receive accelerated ipoib packets
Kaike Wan [Mon, 11 May 2020 16:06:31 +0000 (12:06 -0400)]
IB/hfi1: Add functions to receive accelerated ipoib packets

Ipoib netdev will share receive contexts with existing VNIC netdev.
To achieve that, a dummy netdev is allocated with hfi1_devdata to
own the receive contexts, and ipoib and VNIC netdevs will be put
on top of it. Each receive context is associated with a single
NAPI object.

This patch adds the functions to receive incoming packets for
accelerated ipoib.

Link: https://lore.kernel.org/r/20200511160631.173205.54184.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Rename num_vnic_contexts as num_netdev_contexts
Grzegorz Andrejczuk [Mon, 11 May 2020 16:06:25 +0000 (12:06 -0400)]
IB/hfi1: Rename num_vnic_contexts as num_netdev_contexts

Rename num_vnic_contexts as num_ndetdev_contexts since VNIC and ipoib
will share the same set of receive contexts.

Link: https://lore.kernel.org/r/20200511160625.173205.53306.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/ipoib: Increase ipoib Datagram mode MTU's upper limit
Kaike Wan [Mon, 11 May 2020 16:06:18 +0000 (12:06 -0400)]
IB/ipoib: Increase ipoib Datagram mode MTU's upper limit

Currently the ipoib UD mtu is restricted to 4K bytes. Remove this
limitation so that the IPOIB module can potentially use an MTU (in UD
mode) that is bounded by the MTU of the underlying device. A field is
added to the ib_port_attr structure to indicate the maximum physical
MTU the underlying device supports.

Link: https://lore.kernel.org/r/20200511160618.173205.23053.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: RSM rules for AIP
Grzegorz Andrejczuk [Mon, 11 May 2020 16:06:12 +0000 (12:06 -0400)]
IB/hfi1: RSM rules for AIP

This is implementation of RSM rule for AIP packets.
AIP rule will use rule RSM2 and will match standard
Infiniband packet containg BTH (LNH==BTH) and
having Dest QPN prefixed with value 0x81. Spread between
receive contexts will be done using source QPN bits.

VNIC and AIP will share receive contexts, so their rules
will point to the same RMT entries and their shared
code is moved to separate functions.
If any of the rules is active RMT mapping will be skipped
for latter.

Changed function hfi1_vnic_is_rsm_full to be more general
and moved it from main header to chip.c.

Changed the order of RSM rules because AIP rule as
more specific one is needed to be placed before more
general QOS rule. Rules are occupying two last RSM
registers.

Link: https://lore.kernel.org/r/20200511160612.173205.73002.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/{rdmavt, hfi1}: Implement creation of accelerated UD QPs
Gary Leshner [Mon, 11 May 2020 16:06:07 +0000 (12:06 -0400)]
IB/{rdmavt, hfi1}: Implement creation of accelerated UD QPs

Adds capability to create a qpn to be recognized as an accelerated
UD QP for ipoib.

This is accomplished by reserving 0x81 in byte[0] of the qpn as the
prefix for these qp types and reserving qpns between 0x810000 and
0x81ffff.

The hfi1 capability mask already contained a flag for the VNIC netdev.
This has been renamed and extended to include both VNIC and ipoib.

The rvt code to allocate qps now recognizes this flag and sets 0x81
into byte[0] of the qpn.

The code to allocate qpns is modified to reset the qpn numbering when it
is detected that a value is located in byte[0] for a UD QP and it is a
qpn being requested for net dev use. If it is a regular UD QP then it is
allowable to have bits set in byte[0] of the qpn and provide the
previously normal behavior.

The code to free the qpn now checks for the AIP prefix value of 0x81 and
removes it from the qpn before being freed so that the lower 16 bit
number can be reused.

This patch requires minor changes in the IB core and ipoib to facilitate
the creation of accelerated UP QPs.

Link: https://lore.kernel.org/r/20200511160607.173205.11757.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Remove module parameter for KDETH qpns
Gary Leshner [Mon, 11 May 2020 16:06:00 +0000 (12:06 -0400)]
IB/hfi1: Remove module parameter for KDETH qpns

The module parameter for KDETH qpns is being removed in favor
of always using the default value of 0x80 as the qpn prefix.
Defines have been added for various KDETH values including
the prefix of 0x80.
The reserved range now starts at the base value for KDETH
qpns (0x80) and extends up to and including the last qpn for
other reserved QP prefixed types.
Adjust other QP prefixed define names to match KDETH defined
names.

Link: https://lore.kernel.org/r/20200511160600.173205.27508.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add the transmit side of a datagram ipoib RDMA netdev
Gary Leshner [Mon, 11 May 2020 16:05:54 +0000 (12:05 -0400)]
IB/hfi1: Add the transmit side of a datagram ipoib RDMA netdev

This implements the transmit side of the multiple transmit queue RDMA
netdev used to accelerate ipoib.  The receive side remains the ipoib
internal implementation.

The init/unint/open/stop netdev operations are saved off and called by the
versions within the hfi1 netdev in order to initialize the connected mode
resources present in ipoib thus allowing us to switch modes between
datagram and connected.

The datagram queue pair instantiated by the ipoib ulp is used by this
implementation for its queue pair number and to register with multicast.

The above queue pair is not used on transmit other than its qpn as the
verbs layer is skipped and packets are directly submitted to the sdma
engines.

Link: https://lore.kernel.org/r/20200511160554.173205.1369.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add functions to transmit datagram ipoib packets
Gary Leshner [Mon, 11 May 2020 16:05:48 +0000 (12:05 -0400)]
IB/hfi1: Add functions to transmit datagram ipoib packets

This patch implements the mechanism to accelerate the transmit side of
a multiple transmit queue RDMA netdev by submitting the packets to
the SDMA engine directly instead of sending through the verbs layer.

This patch also changes the UD/SEND_ONLY op to output the entropy value
in byte 0 of deth[1]. UD/SEND_ONLY_WITH_IMMEDIATE uses the previous
behavior with no entropy value being output.

The code in the ipoib rdma netdev which submits tx requests upon
successful submission will call trace_sdma_output_ibhdr to output
the ibhdr to the trace buffer.

Link: https://lore.kernel.org/r/20200511160548.173205.45616.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add accelerated IP capability bit
Kaike Wan [Mon, 11 May 2020 16:05:41 +0000 (12:05 -0400)]
IB/hfi1: Add accelerated IP capability bit

The accelerated IP capability bit is added to allow users to control
which feature is enabled and disabled.

Link: https://lore.kernel.org/r/20200511160541.173205.96870.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Report host information to the device
Gal Pressman [Tue, 12 May 2020 15:22:04 +0000 (18:22 +0300)]
RDMA/efa: Report host information to the device

The host info feature allows the driver to infrom the EFA device
firmware with system configuration for debugging and troubleshooting
purposes.

The host info buffer is passed as an admin command DMA mapped control
buffer, and is unmapped and freed once the command CQE is consumed.

Currently, the setting of host info is done for each device on its
probe. Failing to set the host info for the device shall not disturb the
probe flow, any errors will be discarded.

Link: https://lore.kernel.org/r/20200512152204.93091-3-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Fix setting of wrong bit in get/set_feature commands
Gal Pressman [Tue, 12 May 2020 15:22:03 +0000 (18:22 +0300)]
RDMA/efa: Fix setting of wrong bit in get/set_feature commands

When using a control buffer the ctrl_data bit should be set in order to
indicate the control buffer address is valid, not ctrl_data_indirect
which is used when the control buffer itself is indirect.

Fixes: e9c6c5373088 ("RDMA/efa: Add common command handlers")
Link: https://lore.kernel.org/r/20200512152204.93091-2-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Add init2init as a modify command
Aharon Landau [Wed, 13 May 2020 09:55:50 +0000 (12:55 +0300)]
RDMA/mlx5: Add init2init as a modify command

Missing INIT2INIT entry in the list of modify commands caused DEVX
applications to be unable to modify_qp for this transition state. Add the
MLX5_CMD_OP_INIT2INIT_QP opcode to the list of allowed DEVX opcodes.

Fixes: e662e14d801b ("IB/mlx5: Add DEVX support for modify and query commands")
Link: https://lore.kernel.org/r/20200513095550.211345-1-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Reserve one sge in order to avoid local length error
Lijun Ou [Fri, 8 May 2020 09:45:59 +0000 (17:45 +0800)]
RDMA/hns: Reserve one sge in order to avoid local length error

When rq/srq sge length is smaller than sq sge length, it will produce a
local length error and may cause the bus to hang. Therefore, for rq wqe
and srq wqe, one reserved sge pointing to a reserved mr is used to avoid
this error.

Link: https://lore.kernel.org/r/1588931159-56875-10-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Rename macro for defining hns hardware page size
Xi Wang [Fri, 8 May 2020 09:45:58 +0000 (17:45 +0800)]
RDMA/hns: Rename macro for defining hns hardware page size

Rename the PAGE_ADDR_SHIFT as HNS_HW_PAGE_SHIFT to make code more
readable.

Link: https://lore.kernel.org/r/1588931159-56875-9-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove redundant memcpy()
Weihang Li [Fri, 8 May 2020 09:45:57 +0000 (17:45 +0800)]
RDMA/hns: Remove redundant memcpy()

srq_context is a local variables and is only used to get some fields from
buffer of mailbox. It's meaningless to copy mailbox's buffer's contents
back to it.

Link: https://lore.kernel.org/r/1588931159-56875-8-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Store mr len information into mr obj
Lang Cheng [Fri, 8 May 2020 09:45:56 +0000 (17:45 +0800)]
RDMA/hns: Store mr len information into mr obj

The length information should be stored in the struct ib_mr object,
otherwise the length value of a valid mr object would always be 0.

Link: https://lore.kernel.org/r/1588931159-56875-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix error with to_hr_hem_entries_count()
Weihang Li [Fri, 8 May 2020 09:45:55 +0000 (17:45 +0800)]
RDMA/hns: Fix error with to_hr_hem_entries_count()

For ilog2(x), if x is 0 and not a constant variable, it will return
-1. And there will be an error as below:

 hns3 0000:7d:00.0 hns_0: Local work queue 0x8 catast error, sub_event type is: 2

So modify to_hr_hem_entries_shift() to return 0 if conut is 0.

Fixes: 54d6638765b0 ("RDMA/hns: Optimize WQE buffer size calculating process")
Link: https://lore.kernel.org/r/1588931159-56875-6-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix wrong assignment of SRQ's max_wr
Weihang Li [Fri, 8 May 2020 09:45:54 +0000 (17:45 +0800)]
RDMA/hns: Fix wrong assignment of SRQ's max_wr

srq's attribute max_wr should be 1 less than the total count of wqe.

Fixes: ffb1308b88b6 ("RDMA/hns: Move SRQ code to the reasonable place")
Link: https://lore.kernel.org/r/1588931159-56875-5-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix assignment to ba_pg_sz of eqe
Wenpeng Liang [Fri, 8 May 2020 09:45:53 +0000 (17:45 +0800)]
RDMA/hns: Fix assignment to ba_pg_sz of eqe

When allocating eq buffer, the size of base address page should be defined
by eqe_ba_pg_sz instead of srqwqe_ba_pg_sz.

Fixes: 477a0a387072 ("RDMA/hns: Optimize 0 hop addressing for EQE buffer")
Link: https://lore.kernel.org/r/1588931159-56875-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix cmdq parameter of querying pf timer resource
Lang Cheng [Fri, 8 May 2020 09:45:52 +0000 (17:45 +0800)]
RDMA/hns: Fix cmdq parameter of querying pf timer resource

The firmware has reduced the number of descriptions of command
HNS_ROCE_OPC_QUERY_PF_TIMER_RES to 1. The driver needs to adapt, otherwise
the hardware will report error 4(CMD_NEXT_ERR).

Fixes: 0e40dc2f70cd ("RDMA/hns: Add timer allocation support for hip08")
Link: https://lore.kernel.org/r/1588931159-56875-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Bugfix for querying qkey
Lijun Ou [Fri, 8 May 2020 09:45:51 +0000 (17:45 +0800)]
RDMA/hns: Bugfix for querying qkey

The qkey queried through the query ud qp verb is a fixed value and it
should be read from qp context.

Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1588931159-56875-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/siw: Replace one-element array and use struct_size() helper
Gustavo A. R. Silva [Tue, 19 May 2020 23:30:18 +0000 (18:30 -0500)]
RDMA/siw: Replace one-element array and use struct_size() helper

The current codebase makes use of one-element arrays in the following
form:

struct something {
    int length;
    u8 data[1];
};

struct something *instance;

instance = kmalloc(sizeof(*instance) + size, GFP_KERNEL);
instance->length = size;
memcpy(instance->data, source, size);

but the preferred mechanism to declare variable-length types such as
these ones is a flexible array member[1][2], introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on. So, replace
the one-element array with a flexible-array member.

Also, make use of the new struct_size() helper to properly calculate the
size of struct siw_pbl.

This issue was found with the help of Coccinelle and, audited and fixed
_manually_.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Link: https://lore.kernel.org/r/20200519233018.GA6105@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agornbd/rtrs: Pass max segment size from blk user to the rdma library
Danil Kipnis [Tue, 19 May 2020 11:14:19 +0000 (13:14 +0200)]
rnbd/rtrs: Pass max segment size from blk user to the rdma library

When Block Device Layer is disabled, BLK_MAX_SEGMENT_SIZE is undefined.
The rtrs is a transport library and should compile independently of the
block layer. The desired max segment size should be passed down by the
user.

Introduce max_segment_size parameter for the rtrs_clt_open() call.

Fixes: f7a7a5c228d4 ("block/rnbd: client: main functionality")
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Fixes: cb80329c9434 ("RDMA/rtrs: client: private header with client structs and functions")
Fixes: b5c27cdb094e ("RDMA/rtrs: public interface header to establish RDMA connections")
Link: https://lore.kernel.org/r/20200519111419.924170-1-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: server: Fix some error return code
Wei Yongjun [Tue, 19 May 2020 09:19:12 +0000 (09:19 +0000)]
RDMA/rtrs: server: Fix some error return code

Fix to return negative error code -ENOMEM from the some error handling
cases instead of 0, as done elsewhere in this function.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20200519091912.134358-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: client: Fix function return on success
Gustavo A. R. Silva [Tue, 19 May 2020 16:36:12 +0000 (11:36 -0500)]
RDMA/rtrs: client: Fix function return on success

Remove the if-statement and return the value contained in _err_,
unconditionally.

Link: https://lore.kernel.org/r/20200519163612.GA6043@embeddedor
Addresses-Coverity-ID: 1493753 ("Identical code for different branches")
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: Fix a couple off by one bugs in rtrs_srv_rdma_done()
Dan Carpenter [Tue, 19 May 2020 15:45:25 +0000 (18:45 +0300)]
RDMA/rtrs: Fix a couple off by one bugs in rtrs_srv_rdma_done()

These > comparisons should be >= to prevent accessing one element beyond
the end of the buffer.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200519154525.GA66801@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: Fix some signedness bugs in error handling
Dan Carpenter [Tue, 19 May 2020 13:32:23 +0000 (16:32 +0300)]
RDMA/rtrs: Fix some signedness bugs in error handling

The problem is that "req->sg_cnt" is an unsigned int so if "nr" is
negative, it gets type promoted to a high positive value and the condition
is false.  This patch fixes it by handling negatives separately.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200519133223.GN2078@kadam
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/srpt: Fix disabling device management
Kamal Heib [Thu, 14 May 2020 11:47:20 +0000 (14:47 +0300)]
RDMA/srpt: Fix disabling device management

Avoid disabling device management for devices that don't support
Management datagrams (MADs) by checking if the "mad_agent" pointer is
initialized before calling ib_modify_port, also fix the error flow in
srpt_refresh_port() to disable device management if
ib_register_mad_agent() fail.

Fixes: 09f8a1486dca ("RDMA/srpt: Fix handling of SR-IOV and iWARP ports")
Link: https://lore.kernel.org/r/20200514114720.141139-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Update mlx5_ib driver name
Shay Drory [Wed, 13 May 2020 09:53:04 +0000 (12:53 +0300)]
RDMA/mlx5: Update mlx5_ib driver name

Current description doesn't include new devices, change it by updating to
have more generic description and remove DRIVER_NAME and DRIVER_VERSION
defines.

Link: https://lore.kernel.org/r/20200513095304.210240-1-leon@kernel.org
Signed-off-by: Shay Drory <shayd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/srpt: Add a newline when printing parameter 'srpt_service_guid' by sysfs
Xiongfeng Wang [Mon, 11 May 2020 07:37:09 +0000 (15:37 +0800)]
RDMA/srpt: Add a newline when printing parameter 'srpt_service_guid' by sysfs

When I cat module parameter 'srpt_service_guid', it displays as follows.
It is better to add a newline for easy reading.

[root@hulk-202 ~]# cat /sys/module/ib_srpt/parameters/srpt_service_guid
0x0205cdfffe8346b9[root@hulk-202 ~]#

Link: https://lore.kernel.org/r/1589182629-27743-1-git-send-email-wangxiongfeng2@huawei.com
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Consolidate ib_create_srq flows
Jason Gunthorpe [Wed, 6 May 2020 08:24:39 +0000 (11:24 +0300)]
RDMA/core: Consolidate ib_create_srq flows

The uverbs layer largely duplicate the code in ib_create_srq(), with the
slight difference that it passes in a udata. Move all the code together
into ib_create_srq_user() and provide an inline for kernel users, similar
to other create calls.

Link: https://lore.kernel.org/r/20200506082444.14502-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/uverbs: Fix create WQ to use the given user handle
Yishai Hadas [Wed, 6 May 2020 08:24:42 +0000 (11:24 +0300)]
RDMA/uverbs: Fix create WQ to use the given user handle

Fix create WQ to use the given user handle, in addition dropped some
duplicated code from this flow.

Fixes: fd3c7904db6e ("IB/core: Change idr objects to use the new schema")
Fixes: f213c0527210 ("IB/uverbs: Add WQ support")
Link: https://lore.kernel.org/r/20200506082444.14502-9-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/uverbs: Cleanup wq/srq context usage from uverbs layer
Yishai Hadas [Wed, 6 May 2020 08:24:38 +0000 (11:24 +0300)]
RDMA/uverbs: Cleanup wq/srq context usage from uverbs layer

Both wq_context and srq_context are some leftover from the past in uverbs
layer, they are not really in use, drop them.

Link: https://lore.kernel.org/r/20200506082444.14502-5-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMAINTAINERS: Add maintainers for RNBD/RTRS modules
Jack Wang [Mon, 11 May 2020 13:51:31 +0000 (15:51 +0200)]
MAINTAINERS: Add maintainers for RNBD/RTRS modules

Danil and I will maintain RNBD/RTRS modules.

Link: https://lore.kernel.org/r/20200511135131.27580-26-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: a bit of documentation
Jack Wang [Mon, 11 May 2020 13:51:30 +0000 (15:51 +0200)]
block/rnbd: a bit of documentation

README with description of major sysfs entries, sysfs documentation
are moved to ABI dir as Bart suggested.

Link: https://lore.kernel.org/r/20200511135131.27580-25-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: include client and server modules into kernel compilation
Jack Wang [Mon, 11 May 2020 13:51:29 +0000 (15:51 +0200)]
block/rnbd: include client and server modules into kernel compilation

Add rnbd Makefile, Kconfig and also corresponding lines into upper block
layer files.

Link: https://lore.kernel.org/r/20200511135131.27580-24-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: server: sysfs interface functions
Jack Wang [Mon, 11 May 2020 13:51:28 +0000 (15:51 +0200)]
block/rnbd: server: sysfs interface functions

This is the sysfs interface to rnbd mapped devices on server side:

  /sys/class/rnbd-server/ctl/devices/<device_name>/
    |- block_dev
    |  *** link pointing to the corresponding block device sysfs entry
    |
    |- sessions/<session-name>/
    |  *** sessions directory
       |
       |- read_only
       |  *** is devices mapped as read only
       |
       |- mapping_path
          *** relative device path provided by the client during mapping

Link: https://lore.kernel.org/r/20200511135131.27580-23-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: server: functionality for IO submitting to block dev
Jack Wang [Mon, 11 May 2020 13:51:27 +0000 (15:51 +0200)]
block/rnbd: server: functionality for IO submitting to block dev

This provides helper functions for IO submitting to block dev.

Link: https://lore.kernel.org/r/20200511135131.27580-22-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: server: main functionality
Jack Wang [Mon, 11 May 2020 13:51:26 +0000 (15:51 +0200)]
block/rnbd: server: main functionality

This is main functionality of rnbd-server module, which handles RTRS
events and rnbd protocol requests, like map (open) or unmap (close)
device.  Also server side is responsible for processing incoming IBTRS IO
requests and forward them to local mapped devices.

Link: https://lore.kernel.org/r/20200511135131.27580-21-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: server: private header with server structs and functions
Jack Wang [Mon, 11 May 2020 13:51:25 +0000 (15:51 +0200)]
block/rnbd: server: private header with server structs and functions

This header describes main structs and functions used by rnbd-server
module, namely structs for managing sessions from different clients and
mapped (opened) devices.

Link: https://lore.kernel.org/r/20200511135131.27580-20-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: client: sysfs interface functions
Jack Wang [Mon, 11 May 2020 13:51:24 +0000 (15:51 +0200)]
block/rnbd: client: sysfs interface functions

This is the sysfs interface to rnbd block devices on client side:

  /sys/class/rnbd-client/ctl/
    |- map_device
    |  *** maps remote device
    |
    |- devices/
       *** all mapped devices

  /sys/block/rnbd<N>/rnbd/
    |- unmap_device
    |  *** unmaps device
    |
    |- state
    |  *** device state
    |
    |- session
    |  *** session name
    |
    |- mapping_path
       *** path of the dev that was mapped on server

Link: https://lore.kernel.org/r/20200511135131.27580-19-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: client: main functionality
Jack Wang [Mon, 11 May 2020 13:51:23 +0000 (15:51 +0200)]
block/rnbd: client: main functionality

This is main functionality of rnbd-client module, which provides interface
to map remote device as local block device /dev/rnbd<N> and feeds RTRS
with IO requests.

Link: https://lore.kernel.org/r/20200511135131.27580-18-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: client: private header with client structs and functions
Jack Wang [Mon, 11 May 2020 13:51:22 +0000 (15:51 +0200)]
block/rnbd: client: private header with client structs and functions

This header describes main structs and functions used by rnbd-client
module, mainly for managing RNBD sessions and mapped block devices,
creating and destroying sysfs entries.

Link: https://lore.kernel.org/r/20200511135131.27580-17-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoblock/rnbd: private headers with rnbd protocol structs and helpers
Jack Wang [Mon, 11 May 2020 13:51:21 +0000 (15:51 +0200)]
block/rnbd: private headers with rnbd protocol structs and helpers

These are common private headers with rnbd protocol structures, logging,
sysfs and other helper functions, which are used on both client and server
sides.

Link: https://lore.kernel.org/r/20200511135131.27580-16-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: a bit of documentation
Jack Wang [Mon, 11 May 2020 13:51:20 +0000 (15:51 +0200)]
RDMA/rtrs: a bit of documentation

README with description of major sysfs entries, sysfs documentation has
been moved to ABI dir as suggested by Bart.

Link: https://lore.kernel.org/r/20200511135131.27580-15-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: include client and server modules into kernel compilation
Jack Wang [Mon, 11 May 2020 13:51:19 +0000 (15:51 +0200)]
RDMA/rtrs: include client and server modules into kernel compilation

Add rtrs Makefile, Kconfig and also corresponding lines into upper layer
infiniband/ulp files.

Link: https://lore.kernel.org/r/20200511135131.27580-14-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: server: sysfs interface functions
Jack Wang [Mon, 11 May 2020 13:51:18 +0000 (15:51 +0200)]
RDMA/rtrs: server: sysfs interface functions

This is the sysfs interface to rtrs sessions on server side:

  /sys/class/rtrs-server/<SESS-NAME>/
    *** rtrs session accepted from a client peer
    |
    |- paths/<SRC@DST>/
       *** established paths from a client in a session
       |
       |- disconnect
       |  *** disconnect path
       |
       |- hca_name
       |  *** HCA name
       |
       |- hca_port
       |  *** HCA port
       |
       |- stats/
          *** current path statistics
          |
  |- rdma

Link: https://lore.kernel.org/r/20200511135131.27580-13-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: server: statistics functions
Jack Wang [Mon, 11 May 2020 13:51:17 +0000 (15:51 +0200)]
RDMA/rtrs: server: statistics functions

This introduces set of functions used on server side to account statistics
of RDMA data sent/received.

Link: https://lore.kernel.org/r/20200511135131.27580-12-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: server: main functionality
Jack Wang [Mon, 11 May 2020 13:51:16 +0000 (15:51 +0200)]
RDMA/rtrs: server: main functionality

This is main functionality of rtrs-server module, which accepts set of
RDMA connections (so called rtrs session), creates/destroys sysfs entries
associated with rtrs session and notifies upper layer
(user of RTRS API) about RDMA requests or link events.

Link: https://lore.kernel.org/r/20200511135131.27580-11-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: server: private header with server structs and functions
Jack Wang [Mon, 11 May 2020 13:51:15 +0000 (15:51 +0200)]
RDMA/rtrs: server: private header with server structs and functions

This header describes main structs and functions used by rtrs-server
module, mainly for accepting rtrs sessions, creating/destroying sysfs
entries, accounting statistics on server side.

Link: https://lore.kernel.org/r/20200511135131.27580-10-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: client: sysfs interface functions
Jack Wang [Mon, 11 May 2020 13:51:14 +0000 (15:51 +0200)]
RDMA/rtrs: client: sysfs interface functions

This is the sysfs interface to rtrs sessions on client side:

  /sys/class/rtrs-client/<SESS-NAME>/
    *** rtrs session created by rtrs_clt_open() API call
    |
    |- max_reconnect_attempts
    |  *** number of reconnect attempts for session
    |
    |- add_path
    |  *** adds another connection path into rtrs session
    |
    |- paths/<SRC@DST>/
       *** established paths to server in a session
       |
       |- disconnect
       |  *** disconnect path
       |
       |- reconnect
       |  *** reconnect path
       |
       |- remove_path
       |  *** remove current path
       |
       |- state
       |  *** retrieve current path state
       |
       |- hca_port
       |  *** HCA port number
       |
       |- hca_name
       |  *** HCA name
       |
       |- stats/
          *** current path statistics
          |
  |- cpu_migration
  |- rdma
  |- reconnects
  |- reset_all

Link: https://lore.kernel.org/r/20200511135131.27580-9-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: client: statistics functions
Jack Wang [Mon, 11 May 2020 13:51:13 +0000 (15:51 +0200)]
RDMA/rtrs: client: statistics functions

This introduces set of functions used on client side to account statistics
of RDMA data sent/received, amount of IOs inflight, latency, cpu
migrations, etc.  Almost all statistics are collected using percpu
variables.

Link: https://lore.kernel.org/r/20200511135131.27580-8-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: client: main functionality
Jack Wang [Mon, 11 May 2020 13:51:12 +0000 (15:51 +0200)]
RDMA/rtrs: client: main functionality

This is main functionality of rtrs-client module, which manages set of
RDMA connections for each rtrs session, does multipathing, load balancing
and failover of RDMA requests.

Link: https://lore.kernel.org/r/20200511135131.27580-7-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: client: private header with client structs and functions
Jack Wang [Mon, 11 May 2020 13:51:11 +0000 (15:51 +0200)]
RDMA/rtrs: client: private header with client structs and functions

This header describes main structs and functions used by rtrs-client
module, mainly for managing rtrs sessions, creating/destroying sysfs
entries, accounting statistics on client side.

Link: https://lore.kernel.org/r/20200511135131.27580-6-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: core: lib functions shared between client and server modules
Jack Wang [Mon, 11 May 2020 13:51:10 +0000 (15:51 +0200)]
RDMA/rtrs: core: lib functions shared between client and server modules

This is a set of library functions existing as a rtrs-core module, used by
client and server modules.

Mainly these functions wrap IB and RDMA calls and provide a bit higher
abstraction for implementing of RTRS protocol on client or server sides.

Link: https://lore.kernel.org/r/20200511135131.27580-5-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: private headers with rtrs protocol structs and helpers
Jack Wang [Mon, 11 May 2020 13:51:09 +0000 (15:51 +0200)]
RDMA/rtrs: private headers with rtrs protocol structs and helpers

These are common private headers with rtrs protocol structures, logging,
sysfs and other helper functions, which are used on both client and server
sides.

Link: https://lore.kernel.org/r/20200511135131.27580-4-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rtrs: public interface header to establish RDMA connections
Jack Wang [Mon, 11 May 2020 13:51:08 +0000 (15:51 +0200)]
RDMA/rtrs: public interface header to establish RDMA connections

Introduce public header which provides set of API functions to establish
RDMA connections from client to server machine using RTRS protocol, which
manages RDMA connections for each session, does multipathing and load
balancing.

Main functions for client (active) side:

 rtrs_clt_open() - Creates set of RDMA connections incapsulated
                    in IBTRS session and returns pointer on RTRS
    session object.
 rtrs_clt_close() - Closes RDMA connections associated with RTRS
                     session.
 rtrs_clt_request() - Requests zero-copy RDMA transfer to/from
                       server.

Main functions for server (passive) side:

 rtrs_srv_open() - Starts listening for RTRS clients on specified
                    port and invokes RTRS callbacks for incoming
    RDMA requests or link events.
 rtrs_srv_close() - Closes RTRS server context.

Link: https://lore.kernel.org/r/20200511135131.27580-3-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agosysfs: export sysfs_remove_file_self()
Jack Wang [Mon, 11 May 2020 13:51:07 +0000 (15:51 +0200)]
sysfs: export sysfs_remove_file_self()

Function is going to be used in transport over RDMA module in subsequent
patches, so export it to GPL modules.

Link: https://lore.kernel.org/r/20200511135131.27580-2-danil.kipnis@cloud.ionos.com
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org
[jwang: extend the commit message]
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Fix query_srq_cmd() function
Leon Romanovsky [Wed, 13 May 2020 10:08:09 +0000 (13:08 +0300)]
RDMA/mlx5: Fix query_srq_cmd() function

The output buffer used in mlx5_cmd_exec_inout() was wrongly changed from
pre-allocated srq_out pointer to an input "out" point. That leads to
unpredictable results in the get_srqc() call later.

Fixes: 31578defe4eb ("RDMA/mlx5: Update mlx5_ib to use new cmd interface")
Link: https://lore.kernel.org/r/20200513100809.246315-1-leon@kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Add support for drop action in DV steering
Daria Velikovsky [Mon, 4 May 2020 05:42:27 +0000 (08:42 +0300)]
RDMA/mlx5: Add support for drop action in DV steering

When drop action is used the matching packet will stop processing in
steering and will be dropped. This functionality will allow users to drop
matching packets.

Link: https://lore.kernel.org/r/20200504054227.271486-1-leon@kernel.org
Signed-off-by: Daria Velikovsky <daria@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Add support in steering default miss
Maor Gottlieb [Mon, 4 May 2020 05:30:12 +0000 (08:30 +0300)]
RDMA/mlx5: Add support in steering default miss

User can configure default miss rule in order to skip matching in the user
domain and forward the packet to the kernel steering domain.  When user
requests a default miss rule, we add steering rule to forward the traffic
to the next namespace.

Link: https://lore.kernel.org/r/20200504053012.270689-5-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Refactor DV create flow
Maor Gottlieb [Mon, 4 May 2020 05:30:11 +0000 (08:30 +0300)]
RDMA/mlx5: Refactor DV create flow

Move part of the code that get the destinations into function so the code
will be more readable.  In addition change the variables definition to be
in reversed christmas tree.

Link: https://lore.kernel.org/r/20200504053012.270689-4-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge branch 'mellanox/mlx5-next' into rdma.git for/next
Jason Gunthorpe [Wed, 13 May 2020 18:54:19 +0000 (15:54 -0300)]
Merge branch 'mellanox/mlx5-next' into rdma.git for/next

From the mlx5-next branch at
  git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Required for dependencies in following patches

* branch 'mellanox/mlx5-next':
  net/mlx5: Add support in forward to namespace
  {IB/net}/mlx5: Simplify don't trap code
  net/mlx5: Replace zero-length array with flexible-array

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agonet/mlx5: Add support in forward to namespace
Maor Gottlieb [Mon, 4 May 2020 05:30:10 +0000 (08:30 +0300)]
net/mlx5: Add support in forward to namespace

Currently, fs_core supports rule of forward the traffic
to continue matching in the next priority, now we add support
to forward the traffic matching in the next namespace.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years ago{IB/net}/mlx5: Simplify don't trap code
Maor Gottlieb [Mon, 4 May 2020 05:30:09 +0000 (08:30 +0300)]
{IB/net}/mlx5: Simplify don't trap code

The fs_core already supports creation of rules with multiple
actions/destinations. Refactor fs_core to handle the case
when don't trap rule is created with destination. Adapt the
calling code in the driver.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/rdmavt: Replace zero-length array with flexible-array
Gustavo A. R. Silva [Thu, 7 May 2020 18:53:42 +0000 (13:53 -0500)]
IB/rdmavt: Replace zero-length array with flexible-array

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Link: https://lore.kernel.org/r/20200507185342.GA14476@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Increment the refcount inside cm_find_listen()
Jason Gunthorpe [Wed, 6 May 2020 07:47:01 +0000 (10:47 +0300)]
RDMA/cm: Increment the refcount inside cm_find_listen()

All callers need the 'get', so do it in a central place before returning
the pointer.

Link: https://lore.kernel.org/r/20200506074701.9775-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Remove needless cm_id variable
Jason Gunthorpe [Wed, 6 May 2020 07:47:00 +0000 (10:47 +0300)]
RDMA/cm: Remove needless cm_id variable

Just put the expression in the only reader

Link: https://lore.kernel.org/r/20200506074701.9775-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Remove the cm_free_id() wrapper function
Jason Gunthorpe [Wed, 6 May 2020 07:46:59 +0000 (10:46 +0300)]
RDMA/cm: Remove the cm_free_id() wrapper function

Just call xa_erase directly during cm_destroy_id()

Link: https://lore.kernel.org/r/20200506074701.9775-9-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Make find_remote_id() return a cm_id_private
Jason Gunthorpe [Wed, 6 May 2020 07:46:58 +0000 (10:46 +0300)]
RDMA/cm: Make find_remote_id() return a cm_id_private

The only caller doesn't care about the timewait, so acquire and return the
cm_id_private from the function.

Link: https://lore.kernel.org/r/20200506074701.9775-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Add a note explaining how the timewait is eventually freed
Jason Gunthorpe [Wed, 6 May 2020 07:46:57 +0000 (10:46 +0300)]
RDMA/cm: Add a note explaining how the timewait is eventually freed

The way the cm_timewait_info is converted into a work and then freed
is very subtle and surprising, add a note clarifying the lifetime
here.

Link: https://lore.kernel.org/r/20200506074701.9775-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Pass the cm_id_private into cm_cleanup_timewait
Jason Gunthorpe [Wed, 6 May 2020 07:46:56 +0000 (10:46 +0300)]
RDMA/cm: Pass the cm_id_private into cm_cleanup_timewait

Also rename it to cm_remove_remote(). This function now removes the
tracking of the remote ID/QPN in the redblack trees from a cm_id_private.

Replace a open-coded version with a call. The open coded version was
deleting only the remote_id, however at this call site the qpn can not
have been in the RB tree either, so the cm_remove_remote() will do the
same.

Link: https://lore.kernel.org/r/20200506074701.9775-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Pull duplicated code into cm_queue_work_unlock()
Jason Gunthorpe [Wed, 6 May 2020 07:46:55 +0000 (10:46 +0300)]
RDMA/cm: Pull duplicated code into cm_queue_work_unlock()

While unlocking a spinlock held by the caller is a disturbing pattern,
this extensively duplicated code is even worse. Pull all the duplicates
into a function and explain the purpose of the algorithm.

The on creation side call in cm_req_handler() which is different has been
micro-optimized on the basis that the work_count == -1 during creation,
remove that and just use the normal function.

Link: https://lore.kernel.org/r/20200506074701.9775-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Remove unused store to ret in cm_rej_handler
Danit Goldberg [Wed, 6 May 2020 07:46:54 +0000 (10:46 +0300)]
RDMA/cm: Remove unused store to ret in cm_rej_handler

The 'goto out' label doesn't read ret, so don't set it.

Link: https://lore.kernel.org/r/20200506074701.9775-4-leon@kernel.org
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Remove return code from add_cm_id_to_port_list
Jason Gunthorpe [Wed, 6 May 2020 07:46:53 +0000 (10:46 +0300)]
RDMA/cm: Remove return code from add_cm_id_to_port_list

This cannot happen, all callers pass in one of the two pointers. Use
a WARN_ON guard instead.

Link: https://lore.kernel.org/r/20200506074701.9775-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/addr: Mark addr_resolve as might_sleep()
Jason Gunthorpe [Wed, 6 May 2020 07:46:52 +0000 (10:46 +0300)]
RDMA/addr: Mark addr_resolve as might_sleep()

Under one path through ib_nl_fetch_ha() this calls nlmsg_new(GFP_KERNEL)
which is a sleeping call. This is a very rare path, so mark fetch_ha() and
the module external entry point that conditionally calls through to
fetch_ha() as might_sleep().

Link: https://lore.kernel.org/r/20200506074701.9775-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Combine enable flags of qp
Lang Cheng [Tue, 5 May 2020 10:30:07 +0000 (18:30 +0800)]
RDMA/hns: Combine enable flags of qp

It's easier to understand and maintain enable flags of qp using a single
field in type of unsigned long than defining a field for every flags in
the structure hns_roce_qp, and we can add new flags for features more
conveniently in the future.

Link: https://lore.kernel.org/r/1588674607-25337-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Extend capability flags for HIP08_C
Weihang Li [Tue, 5 May 2020 10:30:06 +0000 (18:30 +0800)]
RDMA/hns: Extend capability flags for HIP08_C

12 bits is not enough for HIP08_C, so extend a new field in length of 16
bits for it.

Link: https://lore.kernel.org/r/1588674607-25337-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/ucma: Return stable IB device index as identifier
Leon Romanovsky [Mon, 4 May 2020 13:25:41 +0000 (16:25 +0300)]
RDMA/ucma: Return stable IB device index as identifier

The librdmacm uses node_guid as identifier to correlate between IB devices
and CMA devices. However FW resets cause to such "connection" to be lost
and require from the user to restart its application.

Extend UCMA to return IB device index, which is stable identifier.

Link: https://lore.kernel.org/r/20200504132541.355710-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agonet/mlx5: Replace zero-length array with flexible-array
Gustavo A. R. Silva [Thu, 7 May 2020 18:59:35 +0000 (13:59 -0500)]
net/mlx5: Replace zero-length array with flexible-array

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoRDMA/mlx5: Remove duplicated assignment to variable rcqe_sz
Colin Ian King [Thu, 7 May 2020 15:16:10 +0000 (16:16 +0100)]
RDMA/mlx5: Remove duplicated assignment to variable rcqe_sz

The variable rcqe_sz is being unnecessarily assigned twice, fix this by
removing one of the duplicates.

Fixes: 8bde2c509e40 ("RDMA/mlx5: Update all DRIVER QP places to use QP subtype")
Link: https://lore.kernel.org/r/20200507151610.52636-1-colin.king@canonical.com
Addresses-Coverity: ("Evaluation order violation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled
Mark Bloch [Wed, 6 May 2020 07:16:02 +0000 (10:16 +0300)]
RDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled

When operating in switchdev mode or using devlink to disable RoCE
only raw Ethernet QPs are allowed to be created.

When in switchdev mode this can lead to passing an invalid port number
as part of the modify qp firmware cmd and will lead to a syndrome
reported back to the user, such as:

 * mlx5_cmd_check:803:(pid 50148): RST2INIT_QP(0x502) op_mod(0x0) failed,
   status bad parameter(0x3), syndrome (0x177405).

Internal UD QP might be used to test for write combining support (even if
externally we report RoCE as disabled) check for that specific flag and
allow is specifically.

Fixes: b5ca15ad7e61 ("IB/mlx5: Add proper representors support")
Link: https://lore.kernel.org/r/20200506071602.7177-3-leon@kernel.org
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Assign profile before calling stages
Mark Bloch [Wed, 6 May 2020 07:16:01 +0000 (10:16 +0300)]
RDMA/mlx5: Assign profile before calling stages

Assign the profile to the IB device before executing stages. This will
allow to check which profile is being used from within a stage.

Link: https://lore.kernel.org/r/20200506071602.7177-2-leon@kernel.org
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Move all WR logic from qp.c to separate file
Leon Romanovsky [Wed, 6 May 2020 06:55:13 +0000 (09:55 +0300)]
RDMA/mlx5: Move all WR logic from qp.c to separate file

Split qp.c by removing all WR logic to separate file.

Link: https://lore.kernel.org/r/20200506065513.4668-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Refactor mlx5_post_send() to improve readability
Max Gurtovoy [Wed, 6 May 2020 06:55:12 +0000 (09:55 +0300)]
RDMA/mlx5: Refactor mlx5_post_send() to improve readability

Add small helpers in order to avoid code duplication and improve code
readability. Decrease the amount of code in the gigantic post_send
function and divide it to readable methods that will help in code
maintenance in the future.

Link: https://lore.kernel.org/r/20200506065513.4668-3-leon@kernel.org
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Update mlx5_ib to use new cmd interface
Leon Romanovsky [Wed, 6 May 2020 06:55:11 +0000 (09:55 +0300)]
RDMA/mlx5: Update mlx5_ib to use new cmd interface

Reuse newly introduced mlx5_cmd_exec_in() and mlx5_cmd_exec_inout() to
reduce code duplication in mlx5_ib module.

Link: https://lore.kernel.org/r/20200506065513.4668-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove redundant assignment of caps
Wenpeng Liang [Thu, 30 Apr 2020 10:31:31 +0000 (18:31 +0800)]
RDMA/hns: Remove redundant assignment of caps

These caps are assigned in query_pf_caps() or set_default_caps(), and
should not be assigned out of these two functions.

Link: https://lore.kernel.org/r/1588242691-12913-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Adjust lp_pktn_ini dynamically
Weihang Li [Thu, 30 Apr 2020 10:31:30 +0000 (18:31 +0800)]
RDMA/hns: Adjust lp_pktn_ini dynamically

lp_pktn_ini means the number of loopback slice packets for long messages,
it should depend on MTU(fixed to 4096B currently) and max size of SQ
inline.

Link: https://lore.kernel.org/r/1588242691-12913-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Fix comments with non-English symbols
Weihang Li [Thu, 30 Apr 2020 10:31:29 +0000 (18:31 +0800)]
RDMA/hns: Fix comments with non-English symbols

There is a comments with some chinese semicolons that cause encoding
issues each time hns_roc_hw_v2.h was modified from a IDE. So fix this by
using correct symbols.

Link: https://lore.kernel.org/r/1588242691-12913-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize SRQ buffer size calculating process
Xi Wang [Tue, 28 Apr 2020 11:03:43 +0000 (19:03 +0800)]
RDMA/hns: Optimize SRQ buffer size calculating process

Optimize the SRQ's WQE buffer parameters calculating process to make the
codes more readable by using new functions about multi-hop addressing to
calculating capabilities of SRQ.

Link: https://lore.kernel.org/r/1588071823-40200-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Move SRQ code to the reasonable place
Yixian Liu [Tue, 28 Apr 2020 11:03:42 +0000 (19:03 +0800)]
RDMA/hns: Move SRQ code to the reasonable place

Just move the SRQ related code to more reasonable place, and unify format
of some prints.

Link: https://lore.kernel.org/r/1588071823-40200-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize WQE buffer size calculating process
Xi Wang [Tue, 28 Apr 2020 11:03:41 +0000 (19:03 +0800)]
RDMA/hns: Optimize WQE buffer size calculating process

Optimize the QP's WQE buffer parameters calculating process to make the
codes more readable mainly by merging calculation of extended sge space of
kernel and userspace. In addition, add some inline functions to simply
codes about multi-hop addressing.

Link: https://lore.kernel.org/r/1588071823-40200-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove unused MTT functions
Xi Wang [Tue, 28 Apr 2020 11:03:40 +0000 (19:03 +0800)]
RDMA/hns: Remove unused MTT functions

The MTT (Memory Translate Table) interface is no longer used to configure
the buffer address to BT (Base Address Table) that requires driver
mapping.  Because the MTT is not compatible with multi-hop addressing of
the hip08, it is replaced by MTR (Memory Translate Region) interface, and
all the MTT functions should be removed.

Link: https://lore.kernel.org/r/1588071823-40200-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Optimize PBL buffer allocation process
Xi Wang [Tue, 28 Apr 2020 11:03:39 +0000 (19:03 +0800)]
RDMA/hns: Optimize PBL buffer allocation process

PBL table has its own implementation for multi-hop addressing currently,
but for the hardware, all table's addressing use the same logic, there is
no need to implement repeatedly. So optimize the PBL buffer allocation
process by using the mtr's interfaces.

Link: https://lore.kernel.org/r/1588071823-40200-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Set UDP source port based on the grh.flow_label
Mark Zhang [Mon, 4 May 2020 05:19:35 +0000 (08:19 +0300)]
RDMA/mlx5: Set UDP source port based on the grh.flow_label

Calculate UDP source port based on the grh.flow_label. If grh.flow_label
is not valid, we will use minimal supported UDP source port.

Link: https://lore.kernel.org/r/20200504051935.269708-6-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>