git.proxmox.com Git - mirror_ubuntu-bionic-kernel.git/log

net: hns3: add handling vlan tag offload in bd

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch deals with the vlan tag information between
sk_buff and rx/tx bd.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9699cffe97fee6eb957a23b58d814a2e62dd43e9)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: add ethtool related offload command

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds the offload commands related to "ethtool -K".

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 052ece6dc19c610a48c1cedeee1b2f1478838e99)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: add vlan offload config command

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds vlan offload config commands and initializes
the hardware's rules for handling tx/rx vlan tags.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5f6ea83fc9784f1edc8b11238722604fb36fa7ad)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: add a mask initialization for mac_vlan table

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch sets the vlan field as masked, so that received
packets are not filtered out.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7564094cd91472c7db215b9440d2664274736897)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: get rss_size_max from configuration but not hardcode

BugLink: http://bugs.launchpad.net/bugs/1756097
Add configuration for rss_size_max in hdev instead of hardcoding it.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Mingguang Qu <qumingguang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0e7a40cdac0a2aa7c6946a571b8428b3307bed85)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: free the ring_data structure when change tqps

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch fixes a memory leak in the tqps change process,
where the functions hns3_uninit_all_ring and hns3_init_all_ring
may be called many times.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Mingguang Qu <qumingguang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 99fdf6b1cadf41bb253408589788f025027274f3)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: change the returned tqp number by ethtool -x

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch modifies the data returned by get_rxnfc: it now returns
the current handle's rss_size rather than the total tqp number,
because tc_size has been changed to the log2 of rss_size rounded
up to a power of two.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Mingguang Qu <qumingguang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f0e98c97fa8f169952ca2c1187f9270902c1800d)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
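
The relationship between rss_size and tc_size described in this message can be sketched in plain C. This is only an illustration of the arithmetic, not the driver's code: the helper names (roundup_pow2, log2_pow2, tc_size_from_rss_size) are hypothetical, and the kernel has its own helpers for these operations.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the nearest power of two (v must be non-zero). */
static uint32_t roundup_pow2(uint32_t v)
{
    uint32_t p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/* Integer log2 of a value that is already a power of two. */
static uint32_t log2_pow2(uint32_t p)
{
    uint32_t n = 0;

    while (p > 1) {
        p >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: tc_size = log2(roundup_pow_of_two(rss_size)). */
static uint32_t tc_size_from_rss_size(uint32_t rss_size)
{
    /* Guard against 0 and against overflow in roundup_pow2(). */
    assert(rss_size > 0 && rss_size <= (1u << 31));
    return log2_pow2(roundup_pow2(rss_size));
}
```

Because tc_size only encodes a power of two, it cannot represent an rss_size such as 5 exactly, which is presumably why the value reported to ethtool is the rss_size itself.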

net: hns3: add support to modify tqps number

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds support for changing the tqps number for the PF driver
by using the ethtool -L command.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Mingguang Qu <qumingguang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 09f2af6405b8cd4b2d91ec88188df6f06da38853)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: add support to query tqps number

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds support for querying the tqps number for the PF driver
by using the ethtool -l command.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Mingguang Qu <qumingguang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 482d2e9c1cc7c0e154464e3e052db09e5e62541f)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add mailbox interrupt handling to PF driver

BugLink: http://bugs.launchpad.net/bugs/1756097
All PF mailbox events are conveyed through a common interrupt
(vector 0). This interrupt vector is shared by reset and mailbox.

This patch adds the handling of the mailbox interrupt event and defers
its processing to a separate mailbox task.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c1a81619d73a436f4b796b44c2711c68aec9b787)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Change PF to add ring-vect binding & resetQ to mailbox

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch is required to support ring-vector binding and reset
of TQPs requested by the VF driver to the PF driver. Mailbox
handler is added with corresponding VF commands/messages to
handle the request.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 84e095d64ed974bd46351650fc8188d372b89fde)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add mailbox support to PF driver

BugLink: http://bugs.launchpad.net/bugs/1756097
The command queue provides a mailbox command which can be used
for communication between PF and VF. The PF handles messages
from various VFs that fetch information such as queue, vlan and
link status. It also handles requests from VFs to perform
certain privileged operations.

This patch adds the support of a message handler for handling
such various command requests from VF.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit dde1a86e93cadf9b17ec0a95a78c99505c48fd83)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC

BugLink: http://bugs.launchpad.net/bugs/1756097
Most of the NAPI handling interface, skb buffer management,
management of the RX/TX descriptors, ethtool interface etc.
has quite a bit of code which is common to VF and PF driver.

This patch makes the existing PF's HNS3 ENET driver the
common ENET driver for both Virtual & Physical Function. This
will help reduce redundancy and make the code easier to
manage.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 424eb834a9be49273c4b32d0d6395dfdbe768a1a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add HNS3 VF driver to kernel build framework

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch introduces the new Makefiles and updates existing
Makefiles required to build the HNS3 Virtual Function driver.
This also updates the Kconfig for introduction of new menuconfig
entries related to VF driver.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e963cb789a29b890678b58ef7da5d7c497510b7e)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds the support of hardware compatibility layer to the
HNS3 VF Driver. This layer implements various {set|get} operations
over MAC address for a virtual port, RSS related configuration,
fetches the link status info from PF, does various VLAN related
configuration over the virtual port, queries the statistics from
the hardware etc.

This layer can directly interact with hardware through the
IMP(Integrated Management Processor) interface or can use mailbox
to interact with the PF driver.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e2cb1dec9779ba2d89302a653eb0abaeb8682196)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add mailbox support to VF driver

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds the support of the mailbox to the VF driver. The
mailbox shall be used as an interface to communicate with the
PF driver for various purposes like {set|get} MAC related
operations, reset, link status etc. The mailbox supports both
synchronous and asynchronous commands sent to the PF driver.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b11a0bb231f3d83429c5e88451ca85ce27c4a9dd)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interface

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds support of command interface for communication with
the IMP(Integrated Management Processor) for HNS3 Virtual Function
Driver.

Each VF has support of CQP(Command Queue Pair) ring interface.
Each CQP consists of a send queue CSQ and a receive queue CRQ.
There are various commands a VF may support, like to query firmware
version, TQP management, statistics, interrupt related, mailbox etc.

This also contains code to initialize the command queue, manage the
command queue descriptors and Rx/Tx protocol with the command processor
in the form of various commands/results and acknowledgements.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fedd0c15d2885e393d4ef4db818b462c3bbfc337)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Refactors the requested reset & pending reset handling code

BugLink: http://bugs.launchpad.net/bugs/1756097
In the existing code, the way to detect whether a driver/client reset
should be executed or whether the hardware should be soft reset was
overly complex.

The existing code used to read the interrupt status register from task
context to figure out if the interrupt source event was a reset, and
then clear the interrupt source for the reset while waiting for the
hardware to finish the reset. This behaviour was again confusing
and overly complex in terms of the flow.

This patch simplifies the handling of the requested reset and the
pending reset (i.e. a reset which has already been asserted by the
software and for which the hardware has acknowledged to the driver,
through an interrupt, that it is processing the hardware reset).

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f2f432f2c37942504f491d9375ecd4fee977dfac)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Add reset service task for handling reset requests

BugLink: http://bugs.launchpad.net/bugs/1756097
The existing common service task was being used to service the reset
requests. This patch tries to make the handling of reset cleaner
by using a separate task to handle the reset requests. This might in
turn help in adopting a similar approach for other interrupt
events, like mailbox, that share the vector 0 interrupt.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cb1b9f77c48fb014da7d020f1395eca4fdfcbd9a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

net: hns3: Refactor of the reset interrupt handling logic

BugLink: http://bugs.launchpad.net/bugs/1756097
The reset interrupt event shares the common miscellaneous interrupt
Vector 0. In the existing reset interrupt handling we disable
the Vector 0 interrupt in the misc interrupt handler and re-enable
it later in the context of the common service task.

This also means other event sources like mailbox would also be
deferred or if the interrupt event was due to mailbox(which shall
be supported for VF soon), it could delay the reset handling.

This patch reorganizes the reset interrupt handling logic and
makes it more fair to other events.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ca1d7669b714d35903fc5dfbf54c990c6122a1d4)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Set the guid for hip08 RoCE device

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch assigns a GUID (Globally Unique Identifier) value to the
hip08 device.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit d4994d2f1f7a7b24622f990d4bb437eacf69b816)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Update the verbs of polling for completion

BugLink: http://bugs.launchpad.net/bugs/1756097
If the port is a RoCEv2 port, the remote port address and QP information
returned for UD are modified.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 2eade675351b0ef9e054ccb62334efd716aa853c)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Assign zero for pkey_index of wc in hip08

BugLink: http://bugs.launchpad.net/bugs/1756097
Because pkey is fixed for hip08 RoCE, pkey_index of wc must be set to
zero. Otherwise, an error occurs when establishing a connection through
the communication management mechanism.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 6c1f08b347f64de38460d5b3d1eb4a30028869cb)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Fill sq wqe context of ud type in hip08

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch mainly configures the fields of the sq wqe of ud type when
posting a wr of gsi qp type.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 7bdee4158b3778493d81bf09105568c91abce110)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Add gsi qp support for modifying qp in hip08

BugLink: http://bugs.launchpad.net/bugs/1756097
Some fields in the qp context need to be assigned values when the qp
type is gsi in hip08.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 0fa95a9a71025a966434573e6b8d7d5c6b50116d)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Create gsi qp in hip08

BugLink: http://bugs.launchpad.net/bugs/1756097
The gsi qp and rc qp use the same qp context structure and the same
creation flow; they are differentiated only by qpn and qp type.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit b66efc932067cb653da6bcbe03fe3f7bc53bf30d)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Assign the correct value for tx_cqn

BugLink: http://bugs.launchpad.net/bugs/1756097
When modifying a qp from init to init, the cqn of the send cq must be
assigned to the tx cqn field of the qp context. Otherwise, an error
occurs when the send and recv cq sizes are different.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 6d13b869ea1e929842e9d758867e4b7473759f9d)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Fix endian problems around imm_data and rkey

BugLink: http://bugs.launchpad.net/bugs/1756097
This matches the changes made recently to the userspace hns
driver when it was made sparse clean.

See rdma-core commit bffd380cfe56 ("libhns: Make the provider sparse
clean")

wc->imm_data is not used in the kernel so this change has no practical
impact.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit ccb8a29e7db29f2b889300a80bd0684d646f796b)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Assign dest_qp when deregistering mr

BugLink: http://bugs.launchpad.net/bugs/1756097
Eight reserved QPs need to be created to work around
a bug in hip06. When deregistering an mr, an rdma write
is issued for every reserved QP.

When modifying a qp from init to rtr, the value of
dest_qp_num must be set. Otherwise, freeing the mr
fails with an error.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 107013ce7b28c3d7395bc0299c0fe3ce12f15b6f)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Fix QP state judgement before sending work requests

BugLink: http://bugs.launchpad.net/bugs/1756097
The QP can accept send work requests only when the QP is
in the states that allow them to be submitted.

This patch updates the QP state judgement based on the
specification.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 10bd2ade4b5fe0e8d0f3a4c4b20f008c4cb06bd2)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
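
As an illustration of the kind of state check this message describes, the sketch below rejects send work requests in the Reset, Init and RTR states and accepts them in the remaining states. This is my reading of the usual InfiniBand QP state rules, not a copy of the driver's check; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the QP states defined by the specification. */
enum qp_state {
    QPS_RESET,
    QPS_INIT,
    QPS_RTR,
    QPS_RTS,
    QPS_SQD,
    QPS_SQE,
    QPS_ERR,
};

/*
 * Return true if a send work request may be submitted in this state.
 * Assumption: Reset, Init and RTR do not allow posting to the send queue;
 * all later states accept the request (in the error state it would be
 * completed with a flush error rather than executed).
 */
static bool qp_can_post_send(enum qp_state state)
{
    switch (state) {
    case QPS_RESET:
    case QPS_INIT:
    case QPS_RTR:
        return false;
    default:
        return true;
    }
}
```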

RDMA/hns: Filter for zero length of sge in hip08 kernel mode

BugLink: http://bugs.launchpad.net/bugs/1756097
When the length of an sge is zero, the driver needs to filter it out.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 52e3b42a2f587912f0558cd4989da160d8b1304a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
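
A minimal sketch of filtering zero-length scatter/gather entries follows. The struct layout and the function name copy_nonzero_sges are assumptions made for illustration; the driver writes entries into its own wqe format rather than into a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter/gather entry, loosely modelled on a verbs sge. */
struct sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/*
 * Copy only the entries with a non-zero length from in[] to out[].
 * out[] must have room for n entries. Returns the number of entries kept.
 */
static size_t copy_nonzero_sges(const struct sge *in, size_t n, struct sge *out)
{
    size_t kept = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (in[i].length == 0)
            continue;           /* skip empty entries */
        out[kept++] = in[i];
    }
    return kept;
}
```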

RDMA/hns: Set access flags of hip08 RoCE

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch refactors the code that sets access flags
for RDMA operations and also handles the case where
attr->max_dest_rd_atomic is zero.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit ace1c5416b37bc9d925f91ee163c47fa6aa16781)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Update the usage of sr_max and rr_max field

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch fixes the usage of the sr_max and rr_max fields of the qp
context when modifying a qp. Its modifications include:
1. Adjust the location where the sr_max field of the qpc is filled
2. Only assign the number of responder resources if the
   IB_QP_MAX_DEST_RD_ATOMIC bit is set
3. Only assign the number of outstanding resources if the
   IB_QP_MAX_QP_RD_ATOMIC bit is set
4. Fix the assignment algorithms for the sr_max and rr_max
   fields of the qp context

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 4f3f7a704b3bff9e4eb322ab3c989b505f7562eb)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
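
Points 2 and 3 above amount to updating each field only when the caller's attribute mask asks for it. The sketch below shows that pattern under stated assumptions: the mask bits, struct layouts and the mapping of max_rd_atomic to sr_max and max_dest_rd_atomic to rr_max are all hypothetical, and the hardware encoding of the values (point 4) is deliberately not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask bits standing in for the IB_QP_* attribute flags. */
#define ATTR_MAX_QP_RD_ATOMIC   (1u << 0)
#define ATTR_MAX_DEST_RD_ATOMIC (1u << 1)

/* Hypothetical, simplified views of the attributes and the qp context. */
struct qp_attr {
    uint8_t max_rd_atomic;      /* outstanding requests as initiator */
    uint8_t max_dest_rd_atomic; /* responder resources */
};

struct qp_ctx {
    uint8_t sr_max;
    uint8_t rr_max;
};

/* Touch a context field only if its bit is set in the attribute mask. */
static void fill_rd_atomic(struct qp_ctx *ctx, const struct qp_attr *attr,
                           unsigned int mask)
{
    assert(ctx != NULL && attr != NULL);
    if (mask & ATTR_MAX_QP_RD_ATOMIC)
        ctx->sr_max = attr->max_rd_atomic;
    if (mask & ATTR_MAX_DEST_RD_ATOMIC)
        ctx->rr_max = attr->max_dest_rd_atomic;
}
```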

RDMA/hns: Add rq inline data support for hip08 RoCE

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch mainly implements the rq inline data feature for hip08
RoCE in kernel mode.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 0009c2dbe8a47008a11abca04da2db57f9eea6a8)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Add detailed comments for mb() call

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds more detailed comments when we call the
memory barrier function, such as rmb, wmb and mb. Three
mb() callers are deleted since they are unnecessary.

Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit 4044a3f482a3373ea5379da47c04ebecb9a3f133)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Add eq support of hip08

BugLink: http://bugs.launchpad.net/bugs/1756097
This patch adds eq support for hip08. The eq table can
be multi-hop addressed.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit a5073d6054f75d7c94b3354206eec4b804d2fbd4)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

RDMA/hns: Refactor eq code for hip06

BugLink: http://bugs.launchpad.net/bugs/1756097
Considering the compatibility of supporting hip08's eq
process and possible changes of data structure, this patch
refactors the eq code structure of hip06.

We move all the eq process code for hip06 from hns_roce_eq.c
into hns_roce_hw_v1.c, and also for hns_roce_eq.h. With
these changes, it will be convenient to add the eq support
for later hardware version.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
(cherry picked from commit b16f8188472efac75f5afc9a8226d635a9075672)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: [Config]: enable RAS_EXTN and ARM_SDE_INTERFACE

BugLink: http://bugs.launchpad.net/bugs/1756096
enable RAS extension and SDEI for ARM64

Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA

BugLink: http://bugs.launchpad.net/bugs/1756096
ARMv8.2 adds a new bit HCR_EL2.TEA which routes synchronous external
aborts to EL2, and adds a trap control bit HCR_EL2.TERR which traps
all Non-secure EL1&0 error record accesses to EL2.

This patch enables the two bits for the guest OS, guaranteeing that
KVM takes external aborts and traps attempts to access the physical
error registers.

ERRIDR_EL1 advertises the number of error records, we return
zero meaning we can treat all the other registers as RAZ/WI too.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
[removed specific emulation, use trap_raz_wi() directly for everything,
rephrased parts of the commit message]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 558daf693e0c7ea118dbfb9539aa5a72f34bad2e)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Handle RAS SErrors from EL2 on guest exit

BugLink: http://bugs.launchpad.net/bugs/1756096
We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, either may be a
RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.

The current SError from EL2 code unmasks SError and tries to fence any
pending SError into a single instruction window. It then leaves SError
unmasked.

With the v8.2 RAS Extensions we may take an SError for a 'corrected'
error, but KVM is only able to handle SError from EL2 if they occur
during this single instruction window...

The RAS Extensions give us a new instruction to synchronise and
consume SErrors. The RAS Extensions document (ARM DDI0587),
'2.4.1 ESB and Unrecoverable errors' describes ESB as synchronising
SError interrupts generated by 'instructions, translation table walks,
hardware updates to the translation tables, and instruction fetches on
the same PE'. This makes ESB equivalent to KVM's existing
'dsb, mrs-daifclr, isb' sequence.

Use the alternatives to synchronise and consume any SError using ESB
instead of unmasking and taking the SError. Set ARM_EXIT_WITH_SERROR_BIT
in the exit_code so that we can restart the vcpu if it turns out this
SError has no impact on the vcpu.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 0067df413bd9d7e9ee3a78ece1e1a93535378862)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Handle RAS SErrors from EL1 on guest exit

BugLink: http://bugs.launchpad.net/bugs/1756096
We expect to have firmware-first handling of RAS SErrors, with errors
notified via an APEI method. For systems without firmware-first, add
some minimal handling to KVM.

There are two ways KVM can take an SError due to a guest, either may be a
RAS error: we exit the guest due to an SError routed to EL2 by HCR_EL2.AMO,
or we take an SError from EL2 when we unmask PSTATE.A from __guest_exit.

For SError that interrupt a guest and are routed to EL2 the existing
behaviour is to inject an impdef SError into the guest.

Add code to handle RAS SError based on the ESR. For uncontained and
uncategorized errors arm64_is_fatal_ras_serror() will panic(), these
errors compromise the host too. All other error types are contained:
For the fatal errors the vCPU can't make progress, so we inject a virtual
SError. We ignore contained errors where we can make progress as if
we're lucky, we may not hit them again.

If only some of the CPUs support RAS the guest will see the cpufeature
sanitised version of the id registers, but we may still take RAS SError
on this CPU. Move the SError handling out of handle_exit() into a new
handler that runs before we can be preempted. This allows us to use
this_cpu_has_cap(), via arm64_is_ras_serror().

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 3368bd809764d3ef0810e16c1e1531fec32e8d8e)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
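
The severity-based handling described in this message can be sketched as a small classifier over the ESR value. The bit positions and AET encodings below are written from my understanding of the ARMv8.2 RAS syndrome layout and should be treated as assumptions; the macro names, the enum and classify_guest_serror() are hypothetical and are not the kernel's identifiers.

```c
#include <stdint.h>

/* Assumed SError syndrome layout (RAS Extensions). */
#define ESR_IDS         (1u << 24)  /* implementation defined syndrome */
#define ESR_FSC_MASK    0x3Fu
#define ESR_FSC_SERROR  0x11u       /* asynchronous SError interrupt */
#define ESR_AET_SHIFT   10
#define ESR_AET_MASK    (0x7u << ESR_AET_SHIFT)

#define AET_UC   0u  /* uncontainable */
#define AET_UEU  1u  /* unrecoverable */
#define AET_UEO  2u  /* restartable */
#define AET_UER  3u  /* recoverable */
#define AET_CE   6u  /* corrected */

enum serror_action {
    SERROR_IGNORE,          /* contained, the vCPU can make progress */
    SERROR_INJECT_VIRTUAL,  /* fatal for the vCPU: inject a virtual SError */
    SERROR_HOST_PANIC,      /* uncontained or uncategorized */
};

static enum serror_action classify_guest_serror(uint32_t esr)
{
    /* Not a decodable RAS syndrome: keep the pre-RAS behaviour. */
    if (esr & ESR_IDS)
        return SERROR_INJECT_VIRTUAL;

    /* No severity information means the error is uncategorized. */
    if ((esr & ESR_FSC_MASK) != ESR_FSC_SERROR)
        return SERROR_HOST_PANIC;

    switch ((esr & ESR_AET_MASK) >> ESR_AET_SHIFT) {
    case AET_CE:
    case AET_UEO:
        return SERROR_IGNORE;
    case AET_UEU:
    case AET_UER:
        return SERROR_INJECT_VIRTUAL;
    case AET_UC:
    default:
        return SERROR_HOST_PANIC;
    }
}
```

Under these assumptions an all-zero ESR falls into the uncategorized case, so it is treated as an error that compromises the host.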

KVM: arm64: Save ESR_EL2 on guest SError

BugLink: http://bugs.launchpad.net/bugs/1756096
When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c60590b552bdf682043579b9b965e6224fbf65d9)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Save/Restore guest DISR_EL1

BugLink: http://bugs.launchpad.net/bugs/1756096
If we deliver a virtual SError to the guest, the guest may defer it
with an ESB instruction. The guest reads the deferred value via DISR_EL1,
but the guest's view of DISR_EL1 is re-mapped to VDISR_EL2 when HCR_EL2.AMO
is set.

Add the KVM code to save/restore VDISR_EL2, and make it accessible to
userspace as DISR_EL1.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c773ae2b34760a1ae409614aa31cdded81a645a5)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.

BugLink: http://bugs.launchpad.net/bugs/1756096
Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
generated an SError with an implementation defined ESR_EL1.ISS, because we
had no mechanism to specify the ESR value.

On Juno this generates an all-zero ESR, the most significant bit 'ISV'
is clear indicating the remainder of the ISS field is invalid.

With the RAS Extensions we have a mechanism to specify this value, and the
most significant bit has a new meaning: 'IDS - Implementation Defined
Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
instead of 'no valid ISS'.

Add KVM support for the VSESR_EL2 register to specify an ESR value when
HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
specify an implementation-defined value.

We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set. KVM
saves/restores this bit during __{,de}activate_traps() and hardware clears the
bit once the guest has consumed the virtual-SError.

Future patches may add an API (or KVM CAP) to pend a virtual SError with
a specified ESR.

Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 4715c14bc136687bb79d12e24aafdc0f38786eb7)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm/arm64: mask/unmask daif around VHE guests

BugLink: http://bugs.launchpad.net/bugs/1756096
Non-VHE systems take an exception to EL2 in order to world-switch into the
guest. When returning from the guest KVM implicitly restores the DAIF
flags when it returns to the kernel at EL1.

With VHE none of this exception-level jumping happens, so KVM's
world-switch code is exposed to the host kernel's DAIF values, and KVM
spills the guest-exit DAIF values back into the host kernel.
On entry to a guest we have Debug and SError exceptions unmasked; KVM
has switched VBAR but isn't prepared to handle these. On guest exit
Debug exceptions are left disabled once we return to the host and will
stay this way until we enter user space.

Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
happen after the host's VBAR value has been synchronised by the isb in
__vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
setting KVM's VBAR value, but is kept here for symmetry.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(backported from commit 4f5abad9e826bd579b0661efa32682d9c9bc3fa8)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: kernel: Prepare for a DISR user

BugLink: http://bugs.launchpad.net/bugs/1756096
KVM would like to consume any pending SError (or RAS error) after guest
exit. Today it has to unmask SError and use dsb+isb to synchronise the
CPU. With the RAS extensions we can use ESB to synchronise any pending
SError.

Add the necessary macros to allow DISR to be read and converted to an
ESR.
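A minimal sketch of the DISR-to-ESR conversion, for illustration only. It
assumes the architectural DISR_EL1 layout (A in bit 31, IDS in bit 24, AET in
bits 12:10, EA in bit 9, DFSC in bits 5:0) and the SError exception class
0x2f. The function and macro names are chosen for this sketch and are not
taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx fields relevant to an SError syndrome. */
#define ESR_EC_SHIFT    26
#define ESR_EC_SERROR   UINT32_C(0x2f)          /* exception class: SError */
#define ESR_ISS_MASK    UINT32_C(0x01ffffff)    /* ISS, bits 24:0 */
#define ESR_AET_MASK    (UINT32_C(0x7) << 10)   /* asynchronous error type */
#define ESR_EA          (UINT32_C(1) << 9)      /* external abort type */
#define ESR_DFSC_MASK   UINT32_C(0x3f)          /* fault status code */

/* DISR_EL1 fields. */
#define DISR_A          (UINT64_C(1) << 31)     /* an SError was deferred */
#define DISR_IDS        (UINT64_C(1) << 24)     /* impdef syndrome format */

/* True if an ESB instruction deferred an SError into DISR. */
bool disr_has_deferred_serror(uint64_t disr)
{
    return (disr & DISR_A) != 0;
}

/* Build the ESR an SError exception would have reported for this DISR. */
uint32_t disr_to_esr(uint64_t disr)
{
    uint32_t esr = ESR_EC_SERROR << ESR_EC_SHIFT;

    if (disr & DISR_IDS)
        /* Implementation-defined syndrome: copy the whole ISS. */
        esr |= (uint32_t)disr & ESR_ISS_MASK;
    else
        /* Architected format: only AET, EA and DFSC are meaningful. */
        esr |= (uint32_t)disr & (ESR_AET_MASK | ESR_EA | ESR_DFSC_MASK);

    return esr;
}
```

A caller would first test disr_has_deferred_serror() and only then pass the
value to disr_to_esr() and on to the normal SError handling.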

We clear the DISR register when we enable the RAS cpufeature, and the
kernel has not executed any ESB instructions. Any value we find in DISR
must have belonged to firmware. Executing an ESB instruction is the
only way to update DISR, so we can expect firmware to have handled
any deferred SError. By the same logic we clear DISR in the idle path.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(backported from commit 68ddbf09ec5a888ec850edd7e7438d2daf069c56)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: Unconditionally enable IESB on exception entry/return for firmware-first

BugLink: http://bugs.launchpad.net/bugs/1756096
ARM v8.2 has a feature to add implicit error synchronization barriers
whenever the CPU enters or returns from an exception level. Add this to the
features we always enable. CPUs that don't support this feature will treat
the bit as RES0.

This feature causes RAS errors that are not yet visible to software to
become pending SErrors. We expect to have firmware-first RAS support
so synchronised RAS errors will be taken immediately to EL3.
Any system without firmware-first handling of errors will take the SError
either immediately after exception return, or when we unmask SError after
entry.S's work.

Adding IESB to the ELx flags causes it to be enabled by KVM and kexec
too.

Platform level RAS support may require additional firmware support.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Suggested-by: Will Deacon <will.deacon@arm.com>
Link: https://www.spinics.net/lists/kvm-arm/msg28192.html
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit f751daa4f9d3da07e2777ea0c1ba2d58ff2c860f)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: kernel: Survive corrected RAS errors notified by SError

BugLink: http://bugs.launchpad.net/bugs/1756096
Prior to v8.2, SError is an uncontainable fatal exception. The v8.2 RAS
extensions use SError to notify software about RAS errors; these can be
contained by the Error Synchronization Barrier.

An ACPI system with firmware-first may use SError as its 'SEI'
notification. Future patches may add code to 'claim' this SError as a
notification.

Other systems can distinguish these RAS errors from the SError ESR and
use the AET bits and additional data from RAS-Error registers to handle
the error. Future patches may add this kernel-first handling.

Without support for either of these we will panic(), even if we received
a corrected error. Add code to decode the severity of RAS errors. We can
safely ignore contained errors where the CPU can continue to make
progress. For all other errors we continue to panic().
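The severity decode described above can be sketched roughly as follows. This
is a standalone model, not the code from the patch: it assumes the
architectural ESR_ELx encoding (IDS in bit 24, AET in bits 12:10, DFSC value
0x11 for an asynchronous SError), and treating "corrected" and "restartable"
as the cases where the CPU can make progress is this sketch's reading of the
text. All names are illustrative.

```c
#include <stdint.h>

#define ESR_IDS          (UINT32_C(1) << 24)    /* impdef syndrome */
#define ESR_AET_MASK     (UINT32_C(0x7) << 10)  /* asynchronous error type */
#define ESR_AET_UC       (UINT32_C(0) << 10)    /* uncontainable */
#define ESR_AET_UEU      (UINT32_C(1) << 10)    /* unrecoverable */
#define ESR_AET_UEO      (UINT32_C(2) << 10)    /* restartable */
#define ESR_AET_UER      (UINT32_C(3) << 10)    /* recoverable */
#define ESR_AET_CE       (UINT32_C(6) << 10)    /* corrected */
#define ESR_DFSC_MASK    UINT32_C(0x3f)
#define ESR_DFSC_SERROR  UINT32_C(0x11)         /* asynchronous SError */

enum serror_action { SERROR_IGNORE, SERROR_PANIC };

/* Return the AET severity, or "uncontainable" if the ESR can't be decoded. */
uint32_t ras_serror_severity(uint32_t esr)
{
    if (esr & ESR_IDS)
        return ESR_AET_UC;      /* impdef syndrome: not a RAS encoding */
    if ((esr & ESR_DFSC_MASK) != ESR_DFSC_SERROR)
        return ESR_AET_UC;      /* AET carries no information here */
    return esr & ESR_AET_MASK;
}

/* Ignore errors the CPU can make progress past; panic for the rest. */
enum serror_action classify_serror(uint32_t esr)
{
    switch (ras_serror_severity(esr)) {
    case ESR_AET_CE:
    case ESR_AET_UEO:
        return SERROR_IGNORE;
    default:
        return SERROR_PANIC;
    }
}
```

Anything that cannot be positively identified as a contained RAS error falls
through to the panic path, which keeps the pre-v8.2 behaviour as the default.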

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 6bf0dcfd713563bd2e13ceb53217305c28a8aa5f)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: cpufeature: Detect CPU RAS Extentions

BugLink: http://bugs.launchpad.net/bugs/1756096
ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions system software
can use additional barriers to isolate errors and determine if faults
are pending. Add cpufeature detection.

Platform level RAS support may require additional firmware support.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
[Rebased added config option, reworded commit message]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(backported from commit 64c02720ea3598bf5143b672274d923a941b8053)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: sysreg: Move to use definitions for all the SCTLR bits

BugLink: http://bugs.launchpad.net/bugs/1756096
__cpu_setup() configures SCTLR_EL1 using some hard coded hex masks,
and el2_setup() duplicates some of this when setting RES1 bits.

Let's make this the same as KVM's hyp_init, which uses named bits.

First, we add definitions for all the SCTLR_EL{1,2} bits, the RES{1,0}
bits, and those we want to set or clear.

Add build_bug checks to ensure all bits are either set or clear.
This means we don't need to preserve endian-ness configuration
generated elsewhere.
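The "every bit is either set or clear" check can be illustrated with a
compile-time assertion. The sketch below uses a made-up 8-bit register; the
DEMO_CTRL_* names and bit positions are hypothetical and do not correspond to
real SCTLR_EL1 fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit control register; names and positions are invented. */
#define DEMO_CTRL_M      (UINT32_C(1) << 0)
#define DEMO_CTRL_A      (UINT32_C(1) << 1)
#define DEMO_CTRL_C      (UINT32_C(1) << 2)
#define DEMO_CTRL_EE     (UINT32_C(1) << 3)
#define DEMO_CTRL_RES1   (UINT32_C(0x3) << 4)
#define DEMO_CTRL_RES0   (UINT32_C(0x3) << 6)

/* Bits we want set, and bits we want clear, at boot. */
#define DEMO_CTRL_SET    (DEMO_CTRL_M | DEMO_CTRL_C | DEMO_CTRL_RES1)
#define DEMO_CTRL_CLEAR  (DEMO_CTRL_A | DEMO_CTRL_EE | DEMO_CTRL_RES0)

/* Fails to compile if a bit is in both masks or in neither. */
static_assert((DEMO_CTRL_SET ^ DEMO_CTRL_CLEAR) == UINT32_C(0xff),
              "every DEMO_CTRL bit must be explicitly set or cleared");

/* With every bit decided, the boot value ignores any earlier contents. */
uint32_t demo_ctrl_boot_value(void)
{
    return DEMO_CTRL_SET;
}
```

Because the two masks partition the register, the boot value is a constant
and nothing written earlier (such as an endianness bit) needs to be read back
and preserved.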

Finally, move the head.S and proc.S users of these hard-coded masks
over to the macro versions.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 7a00d68ebe5f07cb1db17e7fedfd031f0d87e8bb)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

firmware: arm_sdei: Discover SDEI support via ACPI

BugLink: http://bugs.launchpad.net/bugs/1756096
SDEI defines a new ACPI table to indicate the presence of the interface.
The conduit is discovered in the same way as PSCI.

For ACPI we need to create the platform device ourselves as SDEI doesn't
have an entry in the DSDT.

The SDEI platform device should be created after ACPI has been initialised
so that we can parse the table, but before GHES devices are created, which
may register SDE events if they use SDEI as their notification type.

Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 677a60bd2003ff5517a0b502365112531446a3e3)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: acpi: Remove __init from acpi_psci_use_hvc() for use by SDEI

BugLink: http://bugs.launchpad.net/bugs/1756096
SDEI inherits the 'use hvc' bit that is also used by PSCI. PSCI does all
its initialisation early; SDEI does its initialisation late.

Remove the __init annotation from acpi_psci_use_hvc().

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit fa31ab77ced9fbab87fbac4fca3682009b7f9be2)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

firmware: arm_sdei: add support for CPU private events

BugLink: http://bugs.launchpad.net/bugs/1756096
Private SDE events are per-cpu, and need to be registered and enabled
on each CPU.

Hide this detail from the caller by adapting our {,un}register and
{en,dis}able calls to send an IPI to each CPU if the event is private.

CPU private events are unregistered when the CPU is powered-off, and
re-registered when the CPU is brought back online. This saves bringing
secondary cores back online to call private_reset() on shutdown, kexec
and resume from hibernate.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit f92b5462a2f22d13a75dc663f7b2fac16a3e61cb)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

firmware: arm_sdei: Add support for CPU and system power states

BugLink: http://bugs.launchpad.net/bugs/1756096
When a CPU enters an idle lower-power state or is powering off, we
need to mask SDE events so that no events can be delivered while we
are messing with the MMU as the registered entry points won't be valid.

If the system reboots, we want to unregister all events and mask the CPUs.
For kexec this allows us to hand a clean slate to the next kernel
instead of relying on it to call sdei_{private,system}_data_reset().

For hibernate we unregister all events and re-register them on restore,
in case we restored with the SDE code loaded at a different address.
(e.g. KASLR).

Add all the notifiers necessary to do this. We only support shared events
so all events are left registered and enabled over CPU hotplug.

Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[catalin.marinas@arm.com: added CPU_PM_ENTER_FAILED case]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit da351827240e1705cca64bb8ae526f0ce1068048)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: kernel: Add arch-specific SDEI entry code and CPU masking

BugLink: http://bugs.launchpad.net/bugs/1756096
The Software Delegated Exception Interface (SDEI) is an ARM standard
for registering callbacks from the platform firmware into the OS.
This is typically used to implement RAS notifications.

Such notifications enter the kernel at the registered entry-point
with the register values of the interrupted CPU context. Because this
is not a CPU exception, it cannot reuse the existing entry code
(crucially, we don't implicitly know which exception level we interrupted).

Add the entry point to entry.S to set us up for calling into C code. If
the event interrupted code that had interrupts masked, we always return
to that location. Otherwise we pretend this was an IRQ, and use SDEI's
complete_and_resume call to return to vbar_el1 + offset.

This allows the kernel to deliver signals to user space processes. For
KVM this triggers the world switch, a quick spin round vcpu_run, then
back into the guest, unless there are pending signals.

Add sdei_mask_local_cpu() calls to the smp_send_stop() code, this covers
the panic() code-path, which doesn't invoke cpuhotplug notifiers.

Because we can interrupt entry-from/exit-to another EL, we can't trust the
value in sp_el0 or x29, even if we interrupted the kernel. In this case
the code in entry.S will save/restore sp_el0 and use the value in
__entry_task.

When we have VMAP stacks we can interrupt the stack-overflow test, which
stirs x0 into sp, meaning we have to have our own VMAP stacks. For now
these are allocated when we probe the interface. Future patches will add
refcounting hooks to allow the arch code to allocate them lazily.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit f5df26961853d6809d704cedcaf082c57f635a64)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: uaccess: Add PAN helper

BugLink: http://bugs.launchpad.net/bugs/1756096
Add __uaccess_{en,dis}able_hw_pan() helpers to set/clear the PSTATE.PAN
bit.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit e1281f56f114f3a945bf9ec30698bd3caa59d322)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: Add vmap_stack header file

BugLink: http://bugs.launchpad.net/bugs/1756096
Today the arm64 arch code allocates an extra IRQ stack per-cpu. If we
also have SDEI and VMAP stacks we need two extra per-cpu VMAP stacks.

Move the VMAP stack allocation out to a helper in a new header file.
This avoids missing THREADINFO_GFP, or getting the all-important alignment
wrong.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ed8b20d457d72e9e2a30533b436fdb4ea1c70b38)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

firmware: arm_sdei: Add driver for Software Delegated Exceptions

BugLink: http://bugs.launchpad.net/bugs/1756096
The Software Delegated Exception Interface (SDEI) is an ARM standard
for registering callbacks from the platform firmware into the OS.
This is typically used to implement firmware notifications (such as
firmware-first RAS) or to handle an IRQ that has been promoted to a
firmware-assisted NMI.

Add the code for detecting the SDEI version and the framework for
registering and unregistering events. Subsequent patches will add the
arch-specific backend code and the necessary power management hooks.

Only shared events are supported; power management, private events and
discovery for ACPI systems will be added by later patches.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ad6eb31ef90355993eb55ff77e0e855ae7d91e4c)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

Docs: dt: add devicetree binding for describing arm64 SDEI firmware

BugLink: http://bugs.launchpad.net/bugs/1756096
The Software Delegated Exception Interface (SDEI) is an ARM standard
for registering callbacks from the platform firmware into the OS.
This is typically used to implement RAS notifications, or notifications
from an IRQ that has been promoted to a firmware-assisted NMI.

Add a new devicetree binding to describe the SDE firmware interface.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 86f04f640058143388f56c048f86e66ea5204ae2)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Stop save/restoring host tpidr_el1 on VHE

BugLink: http://bugs.launchpad.net/bugs/1756096
Now that a VHE host uses tpidr_el2 for the cpu offset we no longer
need KVM to save/restore tpidr_el1. Move this from the 'common' code
into the non-vhe code. While we're at it, on VHE we don't need to
save the ELR or SPSR as kernel_entry in entry.S will have pushed these
onto the kernel stack, and will restore them from there. Move these
to the non-vhe code as we need them to get back to the host.

Finally, remove the always-copy-tpidr we hid in the stage2 setup
code. cpufeature's enable callback will do this for VHE, so we only
need KVM to do it for non-vhe. Add the copy into kvm-init instead.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 1f742679c33bc083722cb0b442a95d458c491b56)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

arm64: alternatives: use tpidr_el2 on VHE hosts

BugLink: http://bugs.launchpad.net/bugs/1756096
Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in
tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1
on VHE hosts, and allows future code to blindly access per-cpu variables
without triggering world-switch.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 6d99b68933fbcf51f84fcbba49246ce1209ec193)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Change hyp_panic()s dependency on tpidr_el2

BugLink: http://bugs.launchpad.net/bugs/1756096
Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the
host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and
on VHE they can be the same register.

KVM calls hyp_panic() when anything unexpected happens. This may occur
while a guest owns the EL1 registers. KVM stashes the vcpu pointer in
tpidr_el2, which it uses to find the host context in order to restore
the host EL1 registers before parachuting into the host's panic().

The host context is a struct kvm_cpu_context allocated in the per-cpu
area, and mapped to hyp. Given the per-cpu offset for this CPU, this is
easy to find. Change hyp_panic() to take a pointer to the
struct kvm_cpu_context. Wrap these calls with an asm function that
retrieves the struct kvm_cpu_context from the host's per-cpu area.

Copy the per-cpu offset from the host's tpidr_el1 into tpidr_el2 during
kvm init. (Later patches will make this unnecessary for VHE hosts)

We print out the vcpu pointer as part of the panic message. Add a back
reference to the 'running vcpu' in the host cpu context to preserve this.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c97e166e54b662717d20ec2e36761758d2b6a7c2)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocation

BugLink: http://bugs.launchpad.net/bugs/1756096
kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init()
used to store the host EL1 registers when KVM switches to a guest.

Make it easier for ASM to generate pointers into this per-cpu memory
by making it a static allocation.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 36989e7fd386a9a5822c48691473863f8fbb404d)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

KVM: arm64: Store vcpu on the stack during __guest_enter()

BugLink: http://bugs.launchpad.net/bugs/1756096
KVM uses tpidr_el2 as its private vcpu register, which makes sense for
non-vhe world switch as only KVM can access this register. This means
vhe Linux has to use tpidr_el1, which KVM has to save/restore as part
of the host context.

If the SDEI handler code runs behind KVM's back, it mustn't access any
per-cpu variables. To allow this on systems with vhe we need to make
the host use tpidr_el2, saving KVM from save/restoring it.

__guest_enter() stores the host_ctxt on the stack, do the same with
the vcpu.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 32b03d1059667a39e089c45ee38ec9c16332430f)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI/DPC: Enable DPC only if AER is available

BugLink: http://bugs.launchpad.net/bugs/1756094
The "Determination of DPC Control" implementation note in PCIe r4.0, sec
6.1.10, recommends the operating system always link DPC control to the
control of AER, as the two functionalities are strongly connected.

To avoid conflicts over whether platform firmware or the OS controls DPC,
enable DPC only if AER is enabled in the OS, and the device's error
handling does not have firmware-first AER handling.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
(cherry picked from commit eed85ff4c0da72640dcf7c0737c5a08bca2958e7)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI/AER: Return error if AER is not supported

BugLink: http://bugs.launchpad.net/bugs/1756094
get_device_error_info() reads error information from registers in the AER
capability. If we call it for a device that has no AER capability, it
should return an error, but previously it returned success.

Return 0 (error) if the device doesn't have an AER capability.

Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
(cherry picked from commit 0f6f1d9fca4ad91ce9b30dc0aa847b0947786261)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI: Make PCI_SCAN_ALL_PCIE_DEVS work for Root as well as Downstream Ports

BugLink: http://bugs.launchpad.net/bugs/1756094
PCIe Downstream Ports normally have only a Device 0 below them.  To
optimize enumeration, we don't scan for other devices *unless* the
PCI_SCAN_ALL_PCIE_DEVS flag is set by quirks or the
"pci=pcie_scan_all" kernel parameter.

Previously PCI_SCAN_ALL_PCIE_DEVS only affected scanning below Switch
Downstream Ports, not Root Ports.

But the "Nemo" system, also known as the AmigaOne X1000, has a PA Semi Root
Port whose link leads to an AMD/ATI SB600 South Bridge.  The Root Port is a
PCIe device, of course, but the SB600 contains only conventional PCI
devices with no visible PCIe port.

Simplify and restructure only_one_child() so that we scan for all possible
devices below Root Ports as well as Switch Downstream Ports when
PCI_SCAN_ALL_PCIE_DEVS is set.
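The restructured decision can be modelled in a few lines. This is a
standalone sketch with hypothetical types (struct model_bridge, enum
model_port_type), not the kernel's struct pci_dev or the actual
only_one_child() body. It only captures the rule stated above: the flag
overrides the Device 0 optimization for Root Ports and Switch Downstream
Ports alike.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's bridge description. */
enum model_port_type {
    MODEL_PORT_OTHER,
    MODEL_PORT_ROOT,
    MODEL_PORT_SWITCH_UPSTREAM,
    MODEL_PORT_SWITCH_DOWNSTREAM,
};

struct model_bridge {
    bool is_pcie;
    enum model_port_type type;
};

/*
 * Return true if only Device 0 needs to be scanned below the bridge.
 * bridge may be NULL for a root bus with no upstream bridge device.
 */
bool model_only_one_child(const struct model_bridge *bridge,
                          bool scan_all_pcie_devs)
{
    /* Unusual topologies: scan every possible device number. */
    if (scan_all_pcie_devs)
        return false;

    /* A PCIe Downstream Port normally has only Device 0 on its link. */
    if (bridge && bridge->is_pcie &&
        (bridge->type == MODEL_PORT_ROOT ||
         bridge->type == MODEL_PORT_SWITCH_DOWNSTREAM))
        return true;

    return false;
}
```

Checking the flag first is what makes it apply to Root Ports too: the port
type is only consulted once we know the optimization is permitted.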

This is enough to make Nemo work with "pci=pcie_scan_all".  We would also
like to add a quirk to set PCI_SCAN_ALL_PCIE_DEVS automatically on Nemo so
users wouldn't have to use the "pci=pcie_scan_all" parameter, but we don't
have that yet.

Link: https://lkml.kernel.org/r/CAErSpo55Q8Q=5p6_+uu7ahnw+53ibVDNRXxrzRV9QnUr_9EUfw@mail.gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198057
Reported-and-Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit d57f0b8c81393e7105331ac037fa465d5a45c65f)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI/ASPM: Unexport internal ASPM interfaces

BugLink: http://bugs.launchpad.net/bugs/1756094
Several of the interfaces defined in include/linux/pci-aspm.h are used only
internally from the PCI core:

  pcie_aspm_init_link_state()
  pcie_aspm_exit_link_state()
  pcie_aspm_pm_state_change()
  pcie_aspm_powersave_config_link()
  pcie_aspm_create_sysfs_dev_files()
  pcie_aspm_remove_sysfs_dev_files()

Move these to the internal drivers/pci/pci.h header so they don't clutter
the driver interface.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit 7d8e7d19b095ae70b1ca483ca36e7985a108abe5)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI/ASPM: Enable Latency Tolerance Reporting when supported

BugLink: http://bugs.launchpad.net/bugs/1756094
Enable Latency Tolerance Reporting (LTR). Note that LTR must be enabled in
the Root Port first, and must not be enabled in any downstream device
unless the Root Port and all intermediate Switches also support LTR.
See PCIe r3.1, sec 6.18.
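The ordering rule can be illustrated with a small top-down walk. The sketch
below is a model only: struct model_dev and model_configure_ltr() are
hypothetical, and a device with no upstream pointer is assumed to be a Root
Port. It is not the code added by the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node; upstream == NULL is taken to mean Root Port. */
struct model_dev {
    struct model_dev *upstream;
    bool ltr_supported;
    bool ltr_enabled;
};

/*
 * Enable LTR on dev if allowed. Must be called top-down (Root Port first),
 * so that the upstream device's ltr_enabled is already final.
 */
void model_configure_ltr(struct model_dev *dev)
{
    if (!dev->ltr_supported)
        return;

    /* A Root Port may enable LTR on its own; others need a full LTR path. */
    if (dev->upstream == NULL || dev->upstream->ltr_enabled)
        dev->ltr_enabled = true;
}
```

Because each device only checks its immediate parent, a single Switch
without LTR support is enough to keep LTR disabled for everything below it.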

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
(cherry picked from commit c46fd358070f22ba68d6e74c22016a33b914c20a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI/ASPM: Calculate LTR_L1.2_THRESHOLD from device characteristics

BugLink: http://bugs.launchpad.net/bugs/1756094
Per PCIe r3.1, sec 5.5.1, LTR_L1.2_THRESHOLD determines whether we enter
the L1.2 Link state: if L1.2 is enabled and downstream devices have
reported that they can tolerate latency of at least LTR_L1.2_THRESHOLD, we
must enter L1.2 when CLKREQ# is de-asserted.

The implication is that LTR_L1.2_THRESHOLD is the time required to
transition the Link from L0 to L1.2 and back to L0, and per sec 5.5.3.3.1,
Figures 5-16 and 5-17, it appears that the absolute minimum time for those
transitions would be T(POWER_OFF) + T(L1.2) + T(POWER_ON) + T(COMMONMODE).

Therefore, compute LTR_L1.2_THRESHOLD as:

    2us T(POWER_OFF)
  + 4us T(L1.2)
  + T(POWER_ON)
  + T(COMMONMODE)
  = LTR_L1.2_THRESHOLD

Previously we set LTR_L1.2_THRESHOLD to a fixed value of 163840ns
(163.84us):

  #define LTR_L1_2_THRESHOLD_BITS     ((1 << 21) | (1 << 23) | (1 << 30))
  ((1 << 21) | (1 << 23) | (1 << 30)) = 0x40a00000
  LTR_L1.2_THRESHOLD_Value = (0x40a00000 & 0x03ff0000) >> 16 = 0xa0 = 160
  LTR_L1.2_THRESHOLD_Scale = (0x40a00000 & 0xe0000000) >> 29 = 0x2 (* 1024ns)
  LTR_L1.2_THRESHOLD = 160 * 1024ns = 163840ns

Obviously this doesn't account for the circuit characteristics of different
implementations.
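As a worked example of the arithmetic above, the sketch below computes the
new threshold from per-device timings and decodes a raw register value into
nanoseconds. The function names are illustrative, and the decode assumes the
usual LTR scale encoding, in which each scale step multiplies by 32 (so scale
2 means 1024ns, as in the calculation above).

```c
#include <stdint.h>

/* Fixed terms from the formula above, in microseconds. */
#define T_POWER_OFF_US  2u
#define T_L1_2_US       4u

/* LTR_L1.2_THRESHOLD in microseconds for the given device timings. */
uint32_t ltr_l1_2_threshold_us(uint32_t t_power_on_us,
                               uint32_t t_common_mode_us)
{
    return T_POWER_OFF_US + T_L1_2_US + t_power_on_us + t_common_mode_us;
}

/* Decode the Value and Scale fields of a raw register into nanoseconds. */
uint64_t ltr_l1_2_threshold_decode_ns(uint32_t reg)
{
    uint32_t value = (reg & 0x03ff0000u) >> 16;   /* bits 25:16 */
    uint32_t scale = (reg & 0xe0000000u) >> 29;   /* bits 31:29 */

    /* Each scale step is a factor of 32, i.e. a left shift by 5. */
    return (uint64_t)value << (5u * scale);
}
```

Decoding the old fixed constant 0x40a00000 this way gives 163840ns, matching
the figure above, whereas a device with, say, T(POWER_ON) = 10us and
T(COMMONMODE) = 0us would get a threshold of only 16us.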

Note that while firmware may enable LTR, Linux itself currently does not
enable LTR.  When L1.2 is enabled but LTR is not, LTR_L1.2_THRESHOLD is
ignored and we always enter L1.2 when it is enabled and CLKREQ# is
de-asserted.  So this patch should not have any effect unless firmware
enables LTR.

Fixes: f1f0366dd6be ("PCI/ASPM: Calculate and save the L1.2 timing parameters")
Link: https://www.coreboot.org/pipermail/coreboot-gerrit/2015-March/021134.html
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Cc: Kenji Chen <kenji.chen@intel.com>
Cc: Patrick Georgi <pgeorgi@google.com>
Cc: Rajat Jain <rajatja@google.com>
(cherry picked from commit 80d7d7a904fac3f8114448dbb8cc9fa253b10120)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI/AER: Skip recovery callbacks for correctable errors from ACPI APEI

BugLink: http://bugs.launchpad.net/bugs/1756094
PCIe correctable errors are corrected by hardware. Software may log them,
but no other software intervention is required.

There are two paths to enter the AER recovery code: (1) the native path
where Linux fields the AER interrupt and reads the AER registers directly,
and (2) the ACPI path where firmware reads the AER registers and hands them
off to Linux via the ACPI APEI path.

The AER do_recovery() function calls driver error reporting callbacks
(error_detected(), mmio_enabled(), resume(), etc), attempts recovery (for
fatal errors), and logs an "AER: Device recovery successful" message.

Since there's nothing to recover for correctable errors, the native path
already skips do_recovery(), so it doesn't call the driver callbacks or
emit the message. Make the APEI path do the same.

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit b9f80fdc4244b417154ec30d3bc7ec3e76085634)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PCI / PM: Support for LEAVE_SUSPENDED driver flag

BugLink: http://bugs.launchpad.net/bugs/1756094
Add support for DPM_FLAG_LEAVE_SUSPENDED to the PCI bus type by
making it (a) set the power.may_skip_resume status bit for devices
that, from its perspective, may be left in suspend after system
wakeup from sleep and (b) return early from pci_pm_resume_noirq()
for devices whose remaining resume callbacks during the transition
under way are going to be skipped by the PM core.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit bd755d770ac78e8eeda05877ba66cc66f151e10e)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

PM / core: Add LEAVE_SUSPENDED driver flag

BugLink: http://bugs.launchpad.net/bugs/1756094
Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
instruct the PM core and middle-layer (bus type, PM domain, etc.)
code that it is desirable to leave the device in runtime suspend
after system-wide transitions to the working state (for example,
the device may be slow to resume and it may be better to avoid
resuming it right away).

Generally, the middle-layer code involved in the handling of the
device is expected to indicate to the PM core whether or not the
device may be left in suspend with the help of the device's
power.may_skip_resume status bit. That has to happen in the "noirq"
phase of the preceding system suspend (or analogous) transition.
The middle layer is then responsible for handling the device as
appropriate in its "noirq" resume callback which is executed
regardless of whether or not the device may be left suspended, but
the other resume callbacks (except for ->complete) will be skipped
automatically by the core if the device really can be left in
suspend.

The additional power.must_resume status bit introduced for the
implementation of this mechanism is used internally by the PM core
to track the requirement to resume the device (which may depend on
its children etc).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
(backported from commit 0d4b54c6fee87ff60b0bc1007ca487449698468d)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: SAUCE: scsi: hisi_sas: export device table of v3 hw to userspace

BugLink: http://bugs.launchpad.net/bugs/1756094
Export device table of v3 hw to userspace, or auto probe will fail for
v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: SAUCE: scsi: hisi_sas: config for hip08 ES

BugLink: http://bugs.launchpad.net/bugs/1756094
Do some modifications for configuring for hip08 ES

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: fix a bug in hisi_sas_dev_gone()

BugLink: http://bugs.launchpad.net/bugs/1756094
When a device is gone during SAS controller reset, a NULL pointer can be
dereferenced in the free_device callback, because structure sas_dev has
already been cleared.

Actually, we only need to set dev_type to SAS_PHY_UNUSED and need not clear
structure sas_dev, as all members of structure sas_dev will be
re-initialized after the device is found.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0d762b3af2a5b5095fec18aa4d61f408638aa9ca)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: make local symbol host_attrs static

BugLink: http://bugs.launchpad.net/bugs/1756094
Fixes the following sparse warning:

drivers/scsi/hisi_sas/hisi_sas_main.c:1691:25: warning:
symbol 'host_attrs' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1e15feacb9d3743ca0b314a6daf8cc59c90b1046)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: Change frame type for SET MAX commands

BugLink: http://bugs.launchpad.net/bugs/1756094
According to the ATA protocol, SET MAX commands belong to different frame
types. So check the features field of a SET MAX command to decide which
frame type it belongs to.
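
A minimal sketch of such a dispatch on the features field follows. All
names are hypothetical, the feature values are illustrative, and the
grouping of feature values into PIO, DMA and non-data frames is an
assumption made for the example; the patch itself is the authority on the
actual mapping.

```c
#include <stdint.h>

/* Illustrative SET MAX feature values (assumed, not quoted from the patch). */
enum model_set_max_feature {
	MODEL_SET_MAX_ADDR        = 0x00,
	MODEL_SET_MAX_PASSWD      = 0x01,
	MODEL_SET_MAX_LOCK        = 0x02,
	MODEL_SET_MAX_UNLOCK      = 0x03,
	MODEL_SET_MAX_FREEZE_LOCK = 0x04,
	MODEL_SET_MAX_PASSWD_DMA  = 0x05,
	MODEL_SET_MAX_UNLOCK_DMA  = 0x06,
};

enum model_frame_type {
	MODEL_FRAME_NONDATA,
	MODEL_FRAME_PIO,
	MODEL_FRAME_DMA,
};

/* Pick the frame type from the features field (assumed grouping). */
static enum model_frame_type model_set_max_frame_type(uint8_t features)
{
	switch (features) {
	case MODEL_SET_MAX_PASSWD:
	case MODEL_SET_MAX_LOCK:
		return MODEL_FRAME_PIO;
	case MODEL_SET_MAX_PASSWD_DMA:
	case MODEL_SET_MAX_UNLOCK_DMA:
		return MODEL_FRAME_DMA;
	default:
		return MODEL_FRAME_NONDATA;
	}
}
```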

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 468f4b8d0711146f0075513e6047079a26fc3903)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: add v3 hw suspend and resume

BugLink: http://bugs.launchpad.net/bugs/1756094
v3 hw SAS supports changing the power state from D0 to D3 to enter the
low-power state, and from D3 to D0 to leave the low-power state.

On the transition from D0 to D3, HW sends an FLR to clear the registers of
ECAM and BAR space; on the transition from D3 to D0, it clears the
registers of ECAM space only.

So on suspend, do the same as for a controller reset (including disabling
interrupts/DQ/PHY/BUS), and also release slots after the FLR. On resume,
re-configure the registers of BAR space.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4d0951ee70d348b694ce2bbdcc65b684239da4b4)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: re-add the lldd_port_deformed()

BugLink: http://bugs.launchpad.net/bugs/1756094
Function sas_suspend_devices() requires the lldd_port_deformed callback to
be implemented if lldd_port_formed is implemented.

So add a stub for lldd_port_deformed.

Callback lldd_port_deformed was not required as the port deformation is done
elsewhere in the LLDD.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(backported from commit 336bd78bdabf39dbcee6b41f9628c6e51d1c25b0)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: fix SAS_QUEUE_FULL problem while running IO

BugLink: http://bugs.launchpad.net/bugs/1756094
This patch fixes a SAS_QUEUE_FULL problem. The test scenario is closing a
port while running IO.

In sas_eh_handle_sas_errors(), SCSI EH frees the sas_task of the device if
lldd_I_T_nexus_reset() returns TMF_RESP_FUNC_COMPLETE or -ENODEV. But in our
SAS driver, we only free the slots of the device when the return value is
TMF_RESP_FUNC_COMPLETE. So if the return value is -ENODEV, the slot
resources are never freed.

As a solution, we should also free the slots of the device in
lldd_I_T_nexus_reset() if the return value is -ENODEV.
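
The change in the release condition can be sketched as below. This is a
userspace model with hypothetical names; MODEL_TMF_RESP_FUNC_COMPLETE is a
placeholder value and the slot bookkeeping is reduced to a counter.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_TMF_RESP_FUNC_COMPLETE 0x00 /* placeholder value */

/* Hypothetical per-device state: number of slots still held. */
struct model_sas_dev {
	int slots_in_use;
};

/* Release slots on success and also on -ENODEV, matching what EH expects. */
static bool model_should_release_slots(int rc)
{
	return rc == MODEL_TMF_RESP_FUNC_COMPLETE || rc == -ENODEV;
}

/* Model of the tail of the nexus reset handler; reset_rc is the reset result. */
static int model_I_T_nexus_reset(struct model_sas_dev *dev, int reset_rc)
{
	if (model_should_release_slots(reset_rc))
		dev->slots_in_use = 0;
	return reset_rc;
}
```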

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9960a24a1c96a40d6ab984ffefdd0e3003a3377e)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: add internal abort dev in some places

BugLink: http://bugs.launchpad.net/bugs/1756094
We should do an internal abort of the device before TMF_ABORT_TASK_SET and
TMF_LU_RESET, because we may only have done an internal abort for a single
IO in the earlier part of the SCSI EH process. Even for that single-IO
internal abort, we do not know whether it succeeded.

Besides, we should release the slots of the device in
hisi_sas_abort_task_set() if the abort is successful.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 2a03813123c4beb0b60be6b3b65a6b30f7124579)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: judge result of internal abort

BugLink: http://bugs.launchpad.net/bugs/1756094
Normally, hardware should ensure that an internal abort timeout never
happens. If it does happen, it indicates an SoC failure. What's more, HW
will not process any other commands if an internal abort hasn't returned a
CQ, and they will time out as well.

So we should check the result of the internal abort in SCSI EH. If it
failed, we should give up on TMF/softreset and return failure to the upper
layer directly.

This patch does the following to achieve this:

1. When an internal abort timeout happens, we set the return value to -EIO
   in hisi_sas_internal_task_abort().

2. If prep_abort() is not supported, let hisi_sas_internal_task_abort()
   return TMF_RESP_FUNC_FAILED.

3. If hisi_sas_internal_task_abort() returns a negative number, it can be
   assumed that it was not executed properly or that the internal abort
   timed out. Then we skip the subsequent TMF or softreset and return
   failure directly.
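
The three rules above can be modelled roughly as follows. The function
names and the two response constants are hypothetical placeholders for
this sketch; only the return-value convention is taken from the
description.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_TMF_RESP_FUNC_COMPLETE 0x00 /* placeholder value */
#define MODEL_TMF_RESP_FUNC_FAILED   0x05 /* placeholder value */

/* Model of the abort helper's return-value convention. */
static int model_internal_task_abort(bool prep_abort_supported, bool timed_out)
{
	if (!prep_abort_supported)
		return MODEL_TMF_RESP_FUNC_FAILED; /* rule 2 */
	if (timed_out)
		return -EIO;                       /* rule 1 */
	return MODEL_TMF_RESP_FUNC_COMPLETE;
}

/* Rule 3: a negative result means EH must not go on to TMF or softreset. */
static bool model_eh_may_continue(int abort_rc)
{
	return abort_rc >= 0;
}
```

Note that in this model an unsupported prep_abort() yields a non-negative
failure code, so only the timeout path stops error handling early.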

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 813709f2e1e07fa872c05f43801a05828d33a70a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: do link reset for some CHL_INT2 ints

BugLink: http://bugs.launchpad.net/bugs/1756094
We should do a link reset of the PHY on an identify timeout or an STP link
timeout. These are internal events of the SoC and are reported to the
driver through CHL_INT2 interrupts.

Besides, we should add a delayed work to do the link reset, as it needs to
sleep. So this patch adds a new PHY event, HISI_PHYE_LINK_RESET, for this.

Note: v2 HW doesn't report the STP link timeout event. So we only need to
handle the identify timeout event for v2 HW.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 057c3d1f07617049671a41bf05652d20071eb639)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: use an general way to delay PHY work

BugLink: http://bugs.launchpad.net/bugs/1756094
Use a general way to do delayed work for a PHY. This makes it easier to add
new delayed work for a PHY in the future.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e537b62b0796042e1ab66657c4dab662d19e9f0b)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: add v2 hw port AXI error handling support

BugLink: http://bugs.launchpad.net/bugs/1756094
Add port AXI error handling for v2 hw. We do a host controller reset for
such errors.

Besides, change the port multi-bit ECC error handling, as we should also do
a host reset for such errors. So this patch puts them in the same struct as
the port AXI errors.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 72f7fc3050d55e9877ecc56f33b7a434fca186f5)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: improve int_chnl_int_v2_hw() consistency with v3 hw

BugLink: http://bugs.launchpad.net/bugs/1756094
Change the code format of int_chnl_int_v2_hw() to be consistent with v3 hw,
reducing the indentation by one tab.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f64715d2837bee8fcd71f3e13acc7f02c9e9d98a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: add some print to enhance debugging

BugLink: http://bugs.launchpad.net/bugs/1756094
Add prints in some places, such as error info and CQ of exception IO,
device found, etc., and also adjust some log levels.

All this is to assist debugging.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f1c88211454ff8063b358f9ebe250f0fe429319c)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: add RAS feature for v3 hw

BugLink: http://bugs.launchpad.net/bugs/1756094
We use PCIe AER to support the RAS feature for v3 hw. The driver should do
the following two things to support this:

1. Enable RAS interrupts, so that errors can be reported to the RAS module.

2. Implement err_handler for sas_v3_pci_driver. Then, if a non-fatal error
is detected, print the error source and try to recover the SAS controller.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1aaf81e0e34988ff56b317b568f92fe6ca447da2)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: change ncq process for v3 hw

BugLink: http://bugs.launchpad.net/bugs/1756094
For v3 hw, each NCQ command returns a CQ, so there is no need to acquire
the IPTT from the ITCT; just acquire it from the IPTT field of the CQ.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9f347b2face51d782d1e03f2f05b7c3f93a6dc9a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: add an mechanism to do reset work synchronously

BugLink: http://bugs.launchpad.net/bugs/1756094
Sometimes it is required to know when the controller reset has completed and
also whether it has completed successfully. In such places, we previously
called hisi_sas_controller_reset() directly. That may lead to multiple calls
to this function.

This patch creates a per-reset structure which contains a completion
structure and a status flag, to know when the reset completes and also its
status. It is also queued on hisi_hba.wq to do the reset work.

As all host reset work is done in hisi_hba.wq, we don't need to worry about
multiple calls to hisi_sas_controller_reset().

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e402acdb664134f948b62d13b7db866295689f38)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: modify hisi_sas_dev_gone() for reset

BugLink: http://bugs.launchpad.net/bugs/1756094
Do a couple of changes for when HISI_SAS_RESET_BIT is set for HBA:

- Clearing ITCT is not necessary

- Remove internal abort as it will fail during reset

Flag sas_dev->dev_type is kept as SAS_PHY_UNUSED.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f8e45ec226e2c00c1da9cf156ea59a159e9b4ea6)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: some optimizations of host controller reset

BugLink: http://bugs.launchpad.net/bugs/1756094
This patch makes the following optimizations to host controller reset:

1. Unblock SCSI requests before rescanning the topology, as SCSI commands
   need to be used if a new device is found during the rescan.

2. Remove drain_workqueue(hisi_hba->wq) and drain_workqueue(shost->work_q), as
   there is no need to ensure that all PHY events are done before exiting
   host reset.

3. Raise the message print level of host reset. Host reset is an important
   and rare event. We should know its progress even when not debugging.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit fb51e7a8d38484687337f16636c5be9528e00fed)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: optimise port id refresh function

BugLink: http://bugs.launchpad.net/bugs/1756094
Currently refreshing the PHY port id after reset is done in the rescan
topology function, which is quite late in the reset process. It could be moved
earlier in the process, as the port id can be refreshed once the PHYs become
ready.

In addition to this, we should set the hisi_sas_dev port id to 0xff (invalid
port id) if all PHYs of this port remain down for the same device.
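
The invalid-port-id rule can be sketched like this. It is a userspace
model under stated assumptions: the bitmaps, the hw_port_id array and the
function name are hypothetical, and only the 0xff value comes from the
description above.

```c
#include <stdint.h>

#define MODEL_INVALID_PORT_ID 0xff /* invalid port id, as described above */

/*
 * port_phy_bitmap: PHYs that belonged to the device's port before reset.
 * phy_up_bitmap:   PHYs that are up again after reset.
 * hw_port_id[i]:   port id currently reported by hardware for PHY i.
 */
static uint8_t model_refresh_port_id(uint32_t port_phy_bitmap,
				     uint32_t phy_up_bitmap,
				     const uint8_t *hw_port_id, int n_phy)
{
	for (int i = 0; i < n_phy && i < 32; i++) {
		uint32_t bit = UINT32_C(1) << i;

		/* The first PHY of the port that came back supplies the new id. */
		if ((port_phy_bitmap & bit) && (phy_up_bitmap & bit))
			return hw_port_id[i];
	}

	/* All PHYs of this port remained down. */
	return MODEL_INVALID_PORT_ID;
}
```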

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a669bdbf4939ac72eff6b3ae33f771a1ef28448c)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: relocate clearing ITCT and freeing device

BugLink: http://bugs.launchpad.net/bugs/1756094
In certain scenarios we may just want to clear the ITCT for a device, and not
free other resources like the SATA bitmap used in v2 hw.

To facilitate this, this patch relocates the code for clearing the ITCT from
free_device() to a new hw interface, clear_itct(). Then for some hw, we need
not implement free_device() if there's nothing left for it to do.

[mkp: typo]

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0258141aaab3007949ba0e67c3d28436354429bb)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: ata: enhance the definition of SET MAX feature field value

BugLink: http://bugs.launchpad.net/bugs/1756094
There are two other values for the SET MAX feature field according to the
ATA protocol. So define them.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d5c15c2c22a8d4e0e82ca95eac5a6ccd175c0762)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

scsi: hisi_sas: fix dma_unmap_sg() parameter

BugLink: http://bugs.launchpad.net/bugs/1756094
For function dma_unmap_sg(), the <nents> parameter should be number of
elements in the scatterlist prior to the mapping, not after the mapping.

Fix this usage.
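
The rule can be illustrated with a small userspace model. The struct and
both functions are hypothetical stand-ins and do not call the DMA API; they
only show which count has to be remembered for the unmap call.

```c
/* Hypothetical bookkeeping for one mapped scatterlist. */
struct model_sg_table {
	int orig_nents;   /* entries handed to the mapping call */
	int mapped_nents; /* entries the mapping call returned (may be fewer) */
};

/* Stand-in for mapping: an IOMMU may merge adjacent entries. */
static int model_map_sg(struct model_sg_table *t, int nents, int merged)
{
	t->orig_nents = nents;
	t->mapped_nents = nents - merged;
	return t->mapped_nents;
}

/* Value to pass as <nents> when unmapping: the count prior to the mapping. */
static int model_unmap_nents(const struct model_sg_table *t)
{
	return t->orig_nents;
}
```

The mapped count is still the one to use when walking the mapped entries;
only the unmap call needs the original count.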

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit dc1e4730e2b636065628f8427b675788bca83d34)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: [Config] set NOBP and expoline options for s390

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

s390/entry.S: fix spurious zeroing of r0

BugLink: http://bugs.launchpad.net/bugs/1754580
When a system call is interrupted we might call the critical section
cleanup handler that re-does some of the operations. When we are between
.Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the
problem state registers r0-r7:

.Lcleanup_system_call:
[...]
0:      # update accounting time stamp
        mvc     __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER
        # set up saved register r11
        lg      %r15,__LC_KERNEL_STACK
        la      %r9,STACK_FRAME_OVERHEAD(%r15)
        stg     %r9,24(%r11)            # r11 pt_regs pointer
        # fill pt_regs
        mvc     __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC
--->    stmg    %r0,%r7,__PT_R0(%r9)

The problem now is that we might have already zeroed out r0.
The fix is to move the zeroing of r0 after sysc_do_svc.

Reported-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Fixes: 7041d28115e91 ("s390: scrub registers on kernel entry and KVM exit")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
(cherry picked from commit d3f468963cd6fd6d2aa5e26aed8b24232096d0e1)

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

s390: do not bypass BPENTER for interrupt system calls

BugLink: http://bugs.launchpad.net/bugs/1754580
The system call path can be interrupted before the switch back to the
standard branch prediction with BPENTER has been done. The critical
section cleanup code skips forward to .Lsysc_do_svc and bypasses the
BPENTER. In this case the kernel and all subsequent code will run with
the limited branch prediction.

Fixes: eacf67eb9b32 ("s390: run user space and KVM guests with modified branch prediction")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
(cherry picked from commit d5feec04fe578c8dbd9e2e1439afc2f0af761ed4)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>