git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/log

ibmveth: Support to enable LSO/CSO for Trunk VEA.

BugLink: http://bugs.launchpad.net/bugs/1692538
Current largesend and checksum offload feature in ibmveth driver,
- Source VM sends the TCP packets with ip_summed field set as
   CHECKSUM_PARTIAL and TCP pseudo header checksum is placed in
   checksum field
- CHECKSUM_PARTIAL flag in SKB will enable ibmveth driver to mark
   "no checksum" and "checksum good" bits in transmit buffer descriptor
   before the packet is delivered to pseries PowerVM Hypervisor
- If ibmveth has largesend capability enabled, transmit buffer descriptors
   are market accordingly before packet is delivered to Hypervisor
   (along with mss value for packets with length > MSS)
- Destination VM's ibmveth driver receives the packet with "checksum good"
   bit set and so, SKB's ip_summed field is set with CHECKSUM_UNNECESSARY
- If "largesend" bit was on, mss value is copied from receive descriptor
   into SKB's gso_size and other flags are appropriately set for
   packets > MSS size
- The packet is now successfully delivered up the stack in destination VM

The offloads described above works fine for TCP communication among VMs in
the same pseries server ( VM A <=> PowerVM Hypervisor <=> VM B )

We are now enabling support for OVS in pseries PowerVM environment. One of
our requirements is to have ibmveth driver configured in "Trunk" mode, when
they are used with OVS. This is because, PowerVM Hypervisor will no more
bridge the packets between VMs, instead the packets are delivered to
IO Server which hosts OVS to bridge them between VMs or to external
networks (flow shown below),
  VM A <=> PowerVM Hypervisor <=> IO Server(OVS) <=> PowerVM Hypervisor
                                                                   <=> VM B
In "IO server" the packet is received by inbound Trunk ibmveth and then
delivered to OVS, which is then bridged to outbound Trunk ibmveth (shown
below),
        Inbound Trunk ibmveth <=> OVS <=> Outbound Trunk ibmveth

In this model, we hit the following issues which impacted the VM
communication performance,

- Issue 1: ibmveth doesn't support largesend and checksum offload features
   when configured as "Trunk". Driver has explicit checks to prevent
   enabling these offloads.

- Issue 2: SYN packet drops seen at destination VM. When the packet
   originates, it has CHECKSUM_PARTIAL flag set and as it gets delivered to
   IO server's inbound Trunk ibmveth, on validating "checksum good" bits
   in ibmveth receive routine, SKB's ip_summed field is set with
   CHECKSUM_UNNECESSARY flag. This packet is then bridged by OVS (or Linux
   Bridge) and delivered to outbound Trunk ibmveth. At this point the
   outbound ibmveth transmit routine will not set "no checksum" and
   "checksum good" bits in transmit buffer descriptor, as it does so only
   when the ip_summed field is CHECKSUM_PARTIAL. When this packet gets
   delivered to destination VM, TCP layer receives the packet with checksum
   value of 0 and with no checksum related flags in ip_summed field. This
   leads to packet drops. So, TCP connections never goes through fine.

- Issue 3: First packet of a TCP connection will be dropped, if there is
   no OVS flow cached in datapath. OVS while trying to identify the flow,
   computes the checksum. The computed checksum will be invalid at the
   receiving end, as ibmveth transmit routine zeroes out the pseudo
   checksum value in the packet. This leads to packet drop.

- Issue 4: ibmveth driver doesn't have support for SKB's with frag_list.
   When Physical NIC has GRO enabled and when OVS bridges these packets,
   OVS vport send code will end up calling dev_queue_xmit, which in turn
   calls validate_xmit_skb.
   In validate_xmit_skb routine, the larger packets will get segmented into
   MSS sized segments, if SKB has a frag_list and if the driver to which
   they are delivered to doesn't support NETIF_F_FRAGLIST feature.

This patch addresses the above four issues, thereby enabling end to end
largesend and checksum offload support for better performance.

- Fix for Issue 1 : Remove checks which prevent enabling TCP largesend and
   checksum offloads.
- Fix for Issue 2 : When ibmveth receives a packet with "checksum good"
   bit set and if its configured in Trunk mode, set appropriate SKB fields
   using skb_partial_csum_set (ip_summed field is set with
   CHECKSUM_PARTIAL)
- Fix for Issue 3: Recompute the pseudo header checksum before sending the
   SKB up the stack.
- Fix for Issue 4: Linearize the SKBs with frag_list. Though we end up
   allocating buffers and copying data, this fix gives
   upto 4X throughput increase.

Note: All these fixes need to be dropped together as fixing just one of
them will lead to other issues immediately (especially for Issues 1,2 & 3).

Signed-off-by: Sivakumar Krishnasamy <ksiva@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 66aa0678efc29abd2ab02a09b23f9a8bc9f12a6c)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

bonding: avoid NETDEV_CHANGEMTU event when unregistering slave

BugLink: http://bugs.launchpad.net/bugs/1704102
As Hongjun/Nicolas summarized in their original patch:

"
When a device changes from one netns to another, it's first unregistered,
then the netns reference is updated and the dev is registered in the new
netns. Thus, when a slave moves to another netns, it is first
unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
the bonding driver. The driver calls bond_release(), which calls
dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
the old netns).
"

This is a very special case, because the device is being unregistered
no one should still care about the NETDEV_CHANGEMTU event triggered
at this point, we can avoid broadcasting this event on this path,
and avoid touching inetdev_event()/addrconf_notify() path.

It requires to export __dev_set_mtu() to bonding driver.

Reported-by: Hongjun Li <hongjun.li@6wind.com>
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f51048c3e07b68c90b21a77541fc4b208f9244d7)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

drivers: net: xgene: Fix redundant prefetch buffer cleanup

BugLink: http://bugs.launchpad.net/bugs/1693673
Prefetch buffer cleanup code was called twice, causing EDAC to
report errors during reboot.

[ 1130.972475] xgene-edac 78800000.edac: IOB bridge agent (BA) transaction
error
[ 1130.979584] xgene-edac 78800000.edac: IOB BA write response error
[ 1130.985648] xgene-edac 78800000.edac: IOB BA write access at 0x00.00000000
()
[ 1130.993612] xgene-edac 78800000.edac: IOB BA requestor ID 0x00002400
[ 1131.000242] xgene-edac 78800000.edac: IOB bridge agent (BA) transaction
error
...

This patch fixes the errors by,

- removing the redundant prefetch buffer cleanup from port_ops->shutdown()
- moving port_ops->shutdown() after delete_rings()

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8aba8474181070a30f56ffd19359f5d80665175e)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

blk-mq: NVMe 512B/4K+T10 DIF/DIX format returns I/O error on dd with split op

BugLink: http://bugs.launchpad.net/bugs/1689946
When formatting NVMe to 512B/4K + T10 DIf/DIX, dd with split op returns
"Input/output error". Looks block layer split the bio after calling
bio_integrity_prep(bio). This patch fixes the issue.

Below is how we debug this issue:
(1)format nvme to 4K block # size with type 2 DIF
(2)dd with block size bigger than 1024k.
oflag=direct
dd: error writing '/dev/nvme0n1': Input/output error

We added some debug code in nvme device driver. It showed us the first
op and the second op have the same bi and pi address. This is not
correct.

1st op: nvme0n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
dsmgmt=0x0, AT=0x0 & RT=0x505
Guard 0x00b1, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828

2nd op: nvme0n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
AT=0x0 & RT=0x605 ==> This op fails and subsequent 5 retires..
Guard 0x00b1, AT 0x0000, RT physical 0x00000605 RT virtual 0x00002828

With the fix, It showed us both of the first op and the second op have
correct bi and pi address.

1st op: nvme2n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
dsmgmt=0x0, AT=0x0 & RT=0x505
Guard 0x5ccb, AT 0x0000, RT physical 0x00000505 RT virtual
0x00002828
2nd op: nvme2n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
AT=0x0 & RT=0x605
Guard 0xab4c, AT 0x0000, RT physical 0x00000605 RT virtual
0x00003028

Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit f36ea50ca0043e7b1204feaf1d2ba6bd68c08d36)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

bonding: fix 802.3ad support for 5G and 50G speeds

BugLink: http://bugs.launchpad.net/bugs/1697892
This patch adds [5|50] Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.

Fixes: c9a70d43461d ("net-next: ethtool: Added port speed macros.")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c7c550670afda2e16f9e2d06a1473885312eb6b5)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

bonding: add 802.3ad support for 25G speeds

BugLink: http://bugs.launchpad.net/bugs/1697892
Cut-n-paste enablement of 802.3ad bonding on 25G NICs, which currently
report 0 as their bandwidth.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 19ddde1eeca1ee81f4add5e04da66055e09281ac)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling

Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault
handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar
to VM_FAULT_OOM, the only difference is that a different si_code
(BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is
initialized.

BugLink: http://launchpad.net/bugs/1696852
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
(fix new __do_user_fault call-site)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit e7c600f149b89e06073ab50f4f12e79828d3d2f0)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm64: kconfig: allow support for memory failure handling

Declare ARCH_SUPPORTS_MEMORY_FAILURE, as arm64 does support
memory failure recovery attempt.

BugLink: http://launchpad.net/bugs/1696852
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
(Dropped changes to ACPI APEI Kconfig and updated commit log)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit c484f2564db12fa2b01b198ecb6ff0751a3e5e32)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries

When memory failure is enabled, a poisoned hugepage pte is marked as a
swap entry. huge_pte_offset() does not return the poisoned page table
entries when it encounters PUD/PMD hugepages.

This behaviour of huge_pte_offset() leads to error such as below when
munmap is called on poisoned hugepages.

[ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.

Fix huge_pte_offset() to return the poisoned pte which is then
appropriately handled by the generic layer code.

BugLink: http://launchpad.net/bugs/1696852
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Woods <dwoods@mellanox.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit f02ab08afbe76ee7b0b2a34a9970e7dd200d8b01)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm64: mm: Update perf accounting to handle poison faults

Re-organise the perf accounting for fault handling in preparation for
enabling handling of hardware poison faults in subsequent commits. The
change updates perf accounting to be inline with the behaviour on
x86.

With this update, the perf fault accounting -

  * Always report PERF_COUNT_SW_PAGE_FAULTS

  * Doesn't report anything else for VM_FAULT_ERROR (which includes
    hwpoison faults)

  * Reports PERF_COUNT_SW_PAGE_FAULTS_MAJ if it's a major
    fault (indicated by VM_FAULT_MAJOR)

  * Otherwise, reports PERF_COUNT_SW_PAGE_FAULTS_MIN

BugLink: http://launchpad.net/bugs/1696852
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 0e3a9026396cd7d0eb5777ba923a8f30a3d9db19)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: [Config] CONFIG_ACPI_APEI_SEA=y

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

acpi: apei: check for pending errors when probing GHES entries

Check for pending errors when probing GHES entries. It is possible
that a fatal error is already pending at this point, so we should
handle it as soon as the driver is probed. This also avoids a
potential issue if there was an interrupt that was already
cleared for an error since the GHES driver wasn't present.

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 77b246b32b2c4bc21e352dcb8b53a8aba81ee5a4)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm/arm64: KVM: add guest SEA support

Currently external aborts are unsupported by the guest abort
handling. Add handling for SEAs so that the host kernel reports
SEAs which occur in the guest kernel.

When an SEA occurs in the guest kernel, the guest exits and is
routed to kvm_handle_guest_abort(). Prior to this patch, a print
message of an unsupported FSC would be printed and nothing else
would happen. With this patch, the code gets routed to the APEI
handling of SEAs in the host kernel to report the SEA information.

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(backported from commit 621f48e40ee9b0100a802531069166d7d94796e0)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

ras: mark stub functions as 'inline'

With CONFIG_RAS disabled, we get two harmless warnings about
unused functions:

include/linux/ras.h:37:13: error: 'log_arm_hw_error' defined but not used [-Werror=unused-function]
static void log_arm_hw_error(struct cper_sec_proc_arm *err) { return; }
include/linux/ras.h:33:13: error: 'log_non_standard_event' defined but not used [-Werror=unused-function]
static void log_non_standard_event(const guid_t *sec_type,

Clearly these are meant to be 'inline', like the other stubs
in the same header.

Fixes: 297b64c74385 ("ras: acpi / apei: generate trace event for unrecognized CPER section")
Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
BugLink: https://launchpad.net/bugs/1696570
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(backported from commit 0607512d0a8d7fac86667466b884095e04b10a59)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

trace, ras: add ARM processor error trace event

Currently there are trace events for the various RAS
errors with the exception of ARM processor type errors.
Add a new trace event for such errors so that the user
will know when they occur. These trace events are
consistent with the ARM processor error section type
defined in UEFI 2.6 spec section N.2.4.4.

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(backported from commit e9279e83ad1f4b5af541a522a81888f828210b40)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

ras: acpi / apei: generate trace event for unrecognized CPER section

The UEFI spec includes non-standard section type support in the
Common Platform Error Record. This is defined in section N.2.3 of
UEFI version 2.5.

Currently if the CPER section's type (UUID) does not match any
section type that the kernel knows how to parse, a trace event is
not generated.

Generate a trace event which contains the raw error data for
non-standard section type error records.

BugLink: http://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(backported from commit 297b64c74385fc7ea5dfff66105ab6465f2df49a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

efi: print unrecognized CPER section

UEFI spec allows for non-standard section in Common Platform Error
Record. This is defined in section N.2.3 of UEFI version 2.5.

Currently if the CPER section's type (UUID) does not match with
one of the section types that the kernel knows how to parse, the
section is skipped. Therefore, user is not able to see
such CPER data, for instance, error record of non-standard section.

This change prints out the raw data in hex in the dmesg buffer so
that non-standard sections are reported to the user. Non-standard
section type errors should be reported to the user because these
can include errors which are vendor specific. The data length is
taken from Error Data length field of Generic Error Data Entry.

The following is a sample output from dmesg:
Hardware error from APEI Generic Hardware Error Source: 2
It has been corrected by h/w and requires no further action
event severity: corrected
  time: precise 2017-03-15 20:37:35
  Error 0, type: corrected
   section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
   section length: 0x238
   00000000: 4d415201 4d492031 453a4d45 435f4343  .RAM1 IMEM:ECC_C
   00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........
   00000020: 00000000 00000000 00000000 00000000  ................
   00000030: 00000000 00000000 01010000 01010000  ................
   00000040: 00000000 00000000 00000005 00000000  ................
   00000050: 01010000 00000000 00000001 00dddd00  ................
...

The raw data from the error can then be decoded using vendor
specific tools.

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 0fc300f414519b10c146fc3329a1b3094e4b6d52)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

acpi: apei: panic OS with fatal error status block

Even if an error status block's severity is fatal, the kernel does not
honor the severity level and panic.

With the firmware first model, the platform could inform the OS about a
fatal hardware error through the non-NMI GHES notification type. The OS
should panic when a hardware error record is received with this
severity.

Call panic() after CPER data in error status block is printed if
severity is fatal, before each error section is handled.

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 2fb5853e4442334cb66fc2ab33a51c91d4434769)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

acpi: apei: handle SEA notification type for ARMv8

ARM APEI extension proposal added SEA (Synchronous External Abort)
notification type for ARMv8.
Add a new GHES error source handling function for SEA. If an error
source's notification type is SEA, then this function can be registered
into the SEA exception handler. That way GHES will parse and report
SEA exceptions when they occur.
An SEA can interrupt code that had interrupts masked and is treated as
an NMI. To aid this the page of address space for mapping APEI buffers
while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is
changed to use the helper methods to find the prot_t to map with in
the same way as ghes_ioremap_pfn_irq().

BugLink: http://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 7edda0886bc3d1e5418951558a2555af1bc73b0a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm64: exception: handle Synchronous External Abort

SEA exceptions are often caused by an uncorrected hardware
error, and are handled when data abort and instruction abort
exception classes have specific values for their Fault Status
Code.
When SEA occurs, before killing the process, report the error
in the kernel logs.
Update fault_info[] with specific SEA faults so that the
new SEA handler is used.

BugLink: http://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
[will: use NULL instead of 0 when assigning si_addr]
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 32015c235603a209697d1228667b1fb1cc8a8d06)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

efi: parse ARM processor error

Add support for ARM Common Platform Error Record (CPER).
UEFI 2.6 specification adds support for ARM specific
processor error information to be reported as part of the
CPER records. This provides more detail on for processor error logs.

BugLink: http://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 2f74f09bce4f8d0236f20174a6daae63e10fe733)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

cper: add timestamp print to CPER status printing

The ACPI 6.1 spec added a timestamp to the generic error data
entry structure. Print the timestamp out when printing out the
error information.

BugLink: http://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 8a94471fb73fd53589746561a2dba29e073ed829)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

ras: acpi/apei: cper: add support for generic data v3 structure

The ACPI 6.1 spec adds a new revision of the generic error data
entry structure. Add support to handle the new structure as well
as properly verify and iterate through the generic data entries.

BugLink: http://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(backported from commit bbcc2e7b642ed241651fee50ac6ed59643cb1736)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

acpi: apei: read ack upon ghes record consumption

A RAS (Reliability, Availability, Serviceability) controller
may be a separate processor running in parallel with OS
execution, and may generate error records for consumption by
the OS. If the RAS controller produces multiple error records,
then they may be overwritten before the OS has consumed them.

The Generic Hardware Error Source (GHES) v2 structure
introduces the capability for the OS to acknowledge the
consumption of the error record generated by the RAS
controller. A RAS controller supporting GHESv2 shall wait for
the acknowledgment before writing a new error record, thus
eliminating the race condition.

Add support for parsing of GHESv2 sub-tables as well.

BugLink: https://launchpad.net/bugs/1696570
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 42aa560446622da6c33dce54fc8f4cd81516e906)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: Ubuntu-4.10.0-32.36

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

udp: consistently apply ufo or fragmentation

When iteratively building a UDP datagram with MSG_MORE and that
datagram exceeds MTU, consistently choose UFO or fragmentation.

Once skb_is_gso, always apply ufo. Conversely, once a datagram is
split across multiple skbs, do not consider ufo.

Sendpage already maintains the first invariant, only add the second.
IPv6 does not have a sendpage implementation to modify.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
CVE-2017-1000112

(backported from email submission)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: account for current skb length when deciding about UFO

Our customer encountered stuck NFS writes for blocks starting at specific
offsets w.r.t. page boundary caused by networking stack sending packets via
UFO enabled device with wrong checksum. The problem can be reproduced by
composing a long UDP datagram from multiple parts using MSG_MORE flag:

  sendto(sd, buff, 1000, MSG_MORE, ...);
  sendto(sd, buff, 1000, MSG_MORE, ...);
  sendto(sd, buff, 3000, 0, ...);

Assume this packet is to be routed via a device with MTU 1500 and
NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
this condition is tested (among others) to decide whether to call
ip_ufo_append_data():

  ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))

At the moment, we already have skb with 1028 bytes of data which is not
marked for GSO so that the test is false (fragheaderlen is usually 20).
Thus we append second 1000 bytes to this skb without invoking UFO. Third
sendto(), however, has sufficient length to trigger the UFO path so that we
end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
uses udp_csum() to calculate the checksum but that assumes all fragments
have correct checksum in skb->csum which is not true for UFO fragments.

When checking against MTU, we need to add skb->len to length of new segment
if we already have a partially filled skb and fragheaderlen only if there
isn't one.

In the IPv6 case, skb can only be null if this is the first segment so that
we have to use headersize (length of the first IPv6 header) rather than
fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
ip6 fragment between __ip6_append_data and ip6_finish_output")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2017-1000112

(cherry-picked from commit a5cb659bbc1c8644efa0c3138a757a1e432a4880)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

udp: avoid ufo handling on IP payload compression packets

commit c146066ab802 ("ipv4: Don't use ufo handling on later transformed
packets") and commit f89c56ce710a ("ipv6: Don't use ufo handling on
later transformed packets") added a check that 'rt->dst.header_len' isn't
zero in order to skip UFO, but it doesn't include IPcomp in transport mode
where it equals zero.

Packets, after payload compression, may not require further fragmentation,
and if original length exceeds MTU, later compressed packets will be
transmitted incorrectly. This can be reproduced with LTP udp_ipsec.sh test
on veth device with enabled UFO, MTU is 1500 and UDP payload is 2000:

* IPv4 case, offset is wrong + unnecessary fragmentation
    udp_ipsec.sh -p comp -m transport -s 2000 &
    tcpdump -ni ltp_ns_veth2
    ...
    IP (tos 0x0, ttl 64, id 45203, offset 0, flags [+],
      proto Compressed IP (108), length 49)
      10.0.0.2 > 10.0.0.1: IPComp(cpi=0x1000)
    IP (tos 0x0, ttl 64, id 45203, offset 1480, flags [none],
      proto UDP (17), length 21) 10.0.0.2 > 10.0.0.1: ip-proto-17

* IPv6 case, sending small fragments
    udp_ipsec.sh -6 -p comp -m transport -s 2000 &
    tcpdump -ni ltp_ns_veth2
    ...
    IP6 (flowlabel 0x6b9ba, hlim 64, next-header Compressed IP (108)
      payload length: 37) fd00::2 > fd00::1: IPComp(cpi=0x1000)
    IP6 (flowlabel 0x6b9ba, hlim 64, next-header Compressed IP (108)
      payload length: 21) fd00::2 > fd00::1: IPComp(cpi=0x1000)

Fix it by checking 'rt->dst.xfrm' pointer to 'xfrm_state' struct, skip UFO
if xfrm is set. So the new check will include both cases: IPcomp and IPsec.

Fixes: c146066ab802 ("ipv4: Don't use ufo handling on later transformed packets")
Fixes: f89c56ce710a ("ipv6: Don't use ufo handling on later transformed packets")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2017-1000112

(cherry-picked from commit 4b3b45edba9222e518a1ec72df841eba3609fe34)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net-packet: fix race in packet_set_ring on PACKET_RESERVE

PACKET_RESERVE reserves headroom in memory mapped packet ring frames.
The value po->tp_reserve must is verified to be safe in packet_set_ring

  if (unlikely(req->tp_frame_size < po->tp_hdrlen + po->tp_reserve))

and the setsockopt fails once a ring is set.

  if (po->rx_ring.pg_vec || po->tx_ring.pg_vec)
          return -EBUSY;

This operation does not take the socket lock. This leads to a race
similar to the one with PACKET_VERSION fixed in commit 84ac7260236a
("packet: fix race condition in packet_set_ring").

Fix this issue in the same manner: take the socket lock, which as of
that patch is held for the duration of packet_set_ring.

This bug was discovered with syzkaller.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
CVE-2017-1000111

(backported from email submission)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

UBUNTU: Ubuntu-4.10.0-30.34

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

dentry name snapshots

take_dentry_name_snapshot() takes a safe snapshot of dentry name;
if the name is a short one, it gets copied into caller-supplied
structure, otherwise an extra reference to external name is grabbed
(those are never modified). In either case the pointer to stable
string is stored into the same structure.

dentry must be held by the caller of take_dentry_name_snapshot(),
but may be freely dropped afterwards - the snapshot will stay
until destroyed by release_dentry_name_snapshot().

Intended use:
struct name_snapshot s;

take_dentry_name_snapshot(&s, dentry);
...
access s.name
...
release_dentry_name_snapshot(&s);

Replaces fsnotify_oldname_...(), gets used in fsnotify to obtain the name
to pass down with event.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 49d31c2f389acfe83417083e1208422b4091cd9e)
CVE-2017-7533
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Ubuntu-4.10.0-29.33

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()

In commit 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode
on POWER9"), we added additional flags to the OPAL call to configure
CPUs at boot.

These flags only work on Power9 firmwares, and worse can cause boot
failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9)
before adding the extra flags.

Unfortunately we forgot that opal_configure_cores() is called before
the CPU feature checks are dynamically patched, meaning the check
always returns true.

We definitely need to do something to make the CPU feature checks less
prone to bugs like this, but for now the minimal fix is to use
early_cpu_has_feature().

Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Fixes: 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
BugLink: http://bugs.launchpad.net/bugs/1702159
(cherry-picked from commit a70b487b07cf4201bc6702e7f646fa593b23009f linux-next)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

mm/mmap.c: expand_downwards: don't require the gap if !vm_prev

expand_stack(vma) fails if address < stack_guard_gap even if there is no
vma->vm_prev. I don't think this makes sense, and we didn't do this
before the recent commit 1be7107fbe18 ("mm: larger stack guard gap,
between vmas").

We do not need a gap in this case, any address is fine as long as
security_mmap_addr() doesn't object.

This also simplifies the code, we know that address >= prev->vm_end and
thus underflow is not possible.

Link: http://lkml.kernel.org/r/20170628175258.GA24881@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2017-1000364

(cherry picked from commit 32e4e6d5cbb0c0e427391635991fe65e17797af8)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack

Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page.  They are punching a new
MAP_FIXED mapping inside the existing stack Vma.

This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety.  It is a real regression on
the other hand.

Let's work around the problem by considering PROT_NONE mapping as a part
of the stack.  This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.

Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2017-1000364

(cherry picked from commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/powernv: Tell OPAL about our MMU mode on POWER9

BugLink: http://bugs.launchpad.net/bugs/1702159
That will allow OPAL to configure the CPU in an optimal way.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from 1c0eaf0f56d6128af7f0f252855173fcee85d202)
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: [Debian] Support custom and lts kernels in printchanges/insertchanges

Ignore: yes

Currently printchanges/insertchanges do not work for custom kernels
because commit messages for each release follow the format
"UBUNTU: Ubuntu-${flavour}-${prev_fullver}" instead of
"UBUNTU: Ubuntu-${prev_fullver}". Also, for the first release, the
previous version in the changelog does not match the version in the
previous release commit.

This patch makes the base commit selection more flexible, allowing
commit messages in the format "UBUNTU: Ubuntu-*${prev_fullver}" and it
fallbacks to the latest release commit when a exact match is not found
in order to support the custom kernels in their initial releases.

Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

nvme: Quirks for PM1725 controllers

BugLink: http://bugs.launchpad.net/bugs/1704435
PM1725 controllers have a couple of quirks that need to be handled in
the driver:

- I/O queue depth must be limited to 64 entries on controllers that do
   not report MQES.

- The host interface registers go offline briefly while resetting the
   chip. Thus a delay is needed before checking whether the controller
   is ready.

Note that the admin queue depth is also limited to 64 on older versions
of this board. Since our NVME_AQ_DEPTH is now 32 that is no longer an
issue.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
(backported from d554b5e1ca64d23e4f839e6531490fee8479fbaf)
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>

net: hns: Bugfix for Tx timeout handling in hns driver

BugLink: https://bugs.launchpad.net/bugs/1704146
When hns port type is not debug mode, netif_tx_disable is called
when there is a tx timeout, which requires system reboot to return
to normal state. This patch fix this problem by resetting the net
dev.

Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Lin Yun Sheng <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 76b825ab870be3281edac4ae8a414da6e54b0d3a)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

iommu/arm-smmu: Plumb in new ACPI identifiers

BugLink: https://bugs.launchpad.net/bugs/1703437
Revision C of IORT now allows us to identify ARM MMU-401 and the Cavium
ThunderX implementation. Wire them up so that we can probe these models
once firmware starts using the new codes in place of generic ones, and
so that the appropriate features and quirks get enabled when we do.

For the sake of backports and mitigating sychronisation problems with
the ACPICA headers, we'll carry a backup copy of the new definitions
locally for the short term to make life simpler.

CC: stable@vger.kernel.org # 4.10
Acked-by: Robert Richter <rrichter@cavium.com>
Tested-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 84c24379a783c514e5ff7c8fc8a21cf8d64fd05f)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

rxrpc: Fix several cases where a padded len isn't checked in ticket decode

This fixes CVE-2017-7482.

When a kerberos 5 ticket is being decoded so that it can be loaded into an
rxrpc-type key, there are several places in which the length of a
variable-length field is checked to make sure that it's not going to
overrun the available data - but the data is padded to the nearest
four-byte boundary and the code doesn't check for this extra. This could
lead to the size-remaining variable wrapping and the data pointer going
over the end of the buffer.

Fix this by making the various variable-length data checks use the padded
length.

Reported-by: 石磊 <shilei-c@360.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.c.dionne@auristor.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2017-7482

(cherry-picked from commit 5f2f97656ada8d811d3c1bef503ced266fcd53a0)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

fs/exec.c: account for argv/envp pointers

When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included. This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.

For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).

The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely. Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).

[akpm@linux-foundation.org: additional commenting from Kees]
Fixes: b6a2fea39318 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2017-1000365

(cherry-picked from commit 98da7d08850fb8bdeb395d6368ed15753304aa0c)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

drm/virtio: don't leak bo on drm_gem_object_init failure

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170406155941.458-1-kraxel@redhat.com
CVE-2017-10810

(cherry picked from commit 385aee965b4e4c36551c362a334378d2985b722a)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: SAUCE: hio: Fix incorrect use of enum req_opf values

BugLink: http://bugs.launchpad.net/bugs/1701316
Patch from Huawei to fix incorrect use of enumerated values for
bio operations as bitmasks. A reordering of the enum in 4.10
caused a change in behavior which has been leading to data
corruption.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

arm64: mm: select CONFIG_ARCH_PROC_KCORE_TEXT

BugLink: https://bugs.launchpad.net/bugs/1702749
To avoid issues with the /proc/kcore code getting confused about the
kernels block mappings in the VMALLOC region, enable the existing
facility that describes the [_text, _end) interval as a separate
KCORE_TEXT region, which supersedes the KCORE_VMALLOC region that
it intersects with on arm64.

Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 8f36094802e4e6de180b36bcac4cfd9d319e1b64)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

fs/proc: kcore: use kcore_list type to check for vmalloc/module address

BugLink: https://bugs.launchpad.net/bugs/1702749
Instead of passing each start address into is_vmalloc_or_module_addr()
to decide whether it falls into either the VMALLOC or the MODULES region,
we can simply check the type field of the current kcore_list entry, since
it will be set to KCORE_VMALLOC based on exactly the same conditions.

As a bonus, when reading the KCORE_TEXT region on architectures that have
one, this will avoid using vread() on the region if it happens to intersect
with a KCORE_VMALLOC region. This is due the fact that the KCORE_TEXT
region is the first one to be added to the kcore region list.

Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 737326aa510b5f7d2f38ded739914a9d5e4e4cea)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Update debug prints in reset handlers

BugLink: http://bugs.launchpad.net/bugs/1702521
The device and host reset handler contain debug prints to help identify
the entities being reset. Today these reset handlers are based on a SCSI
EH design that uses a SCSI command reference as a means of identifying
the target entity. As such, the debug trace includes the SCSI command
pointer and associated CDB. This is not necessary as the SCSI command is
simply the messenger in these scenarios.

Refactor the debug prints in the host and reset handlers to only present
information that is applicable given the function scope.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5a4d9d7790422c4a92d8ca52e37c1e2b45d42c27)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Update send_tmf() parameters

BugLink: http://bugs.launchpad.net/bugs/1702521
The current send_tmf() implementation is based on the caller providing a
SCSI command reference. In reality all that is needed is a SCSI device
reference as the routine uses a private command.

Refactor send_tmf() to pass the private adapter configuration reference
and a SCSI device reference. As a nice side effect, this will ease the
burden of converting caller routines to be based solely off of a SCSI
device reference.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 32abbedaafde5a0c1edfd07369dde73a4fda2554)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Avoid double free of character device

BugLink: http://bugs.launchpad.net/bugs/1702521
The device_unregister() service used when cleaning up the character
device is already responsible for the internal state associated with the
device upon successful creation. As the cxlflash driver does not obtain
a second reference to the character device, the explicit call to
put_device() is not required and can lead to an inconsistent sysfs among
other issues as the reference is no longer valid after the first
put_device() is performed.

Remove the unnecessary put_device() to remedy this issue.

Fixes: a834a36b57d9 ("scsi: cxlflash: Create character device to provide host management interface")
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit acfeb23b29894deaee65d63c55bea09183f6b538)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Update TMF command processing

BugLink: http://bugs.launchpad.net/bugs/1702521
Currently, the SCSI command presented to the device reset handler is used
to send TMFs to the AFU for a device reset. This behavior is incorrect as
the command presented is an actual command and not a special notification.
As such, it should only be used for reference and not be acted upon.

Additionally, the existing TMF transmission routine does not account for
actual errors from the hardware, only reflecting failure when a timeout
occurs. This can lead to a condition where the device reset handler is
presented with a false 'success'.

Update send_tmf() to dynamically allocate a private command for sending
the TMF command and properly reflect failure when the completed command
indicates an error or was aborted. Detect TMF commands during response
processing and avoid scsi_done() for these types of commands. Lastly,
update comments in the TMF processing paths to describe the new behavior.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8ba1ddb31f528cb45be39b7f3b600261afaa7920)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Remove zeroing of private command data

BugLink: http://bugs.launchpad.net/bugs/1702521
The SCSI core now zeroes the per-command private data area prior to
calling into the LLD. Replace the clearing operation that takes place
when the private command data reference is obtained with a routine that
performs common initializations. The zeroing that takes place in the
device reset path remains intact as the private command data associated
with the specified SCSI command is not guaranteed to be cleared.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 479ad8e9d48c4d82c92417b012193e967fc33b8a)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support WS16 unmap

BugLink: http://bugs.launchpad.net/bugs/1702521
The cxlflash driver supports performing a write-same16 to scrub virtual
luns when they are released by a user. To date, AFUs for adapters that
are supported by cxlflash do not have the capability to unmap as part of
the WS operation. This can lead to fragmented flash devices which results
in performance degradation.

Future AFUs can optionally support unmap write-same commands and reflects
this support via the context control register. This provides userspace
applications with direct visibility such that they need not depend on a
host API.

Detect unmap support during cxlflash initialization by reading the context
control register associated with the primary hardware queue. Update the
existing write_same16() routine to set the unmap bit in the CDB when unmap
is supported by the host.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 3223c01aa1cec60d59bd218aca5e202b558d225a)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support AFU debug

BugLink: http://bugs.launchpad.net/bugs/1702521
Adopt the SISLite AFU debug capability to allow future CXL Flash
adapters the ability to better debug AFU issues. Update the SISLite
header with the changes necessary to support AFU debug operations
and create a host ioctl interface for user debug software. Also
update the cxlflash documentation to describe this new host ioctl.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit bc88ac47d5cb11c7dd9896781f793fae519d53fa)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support LUN provisioning

BugLink: http://bugs.launchpad.net/bugs/1702521
Adopt the SISLite AFU LUN provisioning capability to allow future CXL
Flash adapters the ability to better manage storage. Update the SISLite
header with the changes necessary to support LUN provision operations
and create a host ioctl interface for user LUN management software. Also
update the cxlflash documentation to describe this new host ioctl.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9cf43a360450ddd758b0021d1b55f1cc5643b9ed)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Refactor AFU capability checking

BugLink: http://bugs.launchpad.net/bugs/1702521
The existing AFU capability checking infrastructure is closely tied to
the command mode capability bits. In order to support new capabilities,
refactor the existing infrastructure to be more generic.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit efa1c818d3458fe97d8f83f40051518b44183234)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Introduce host ioctl support

BugLink: http://bugs.launchpad.net/bugs/1702521
As staging for supporting various host management functions, add a host
ioctl infrastructure to filter ioctl commands and perform operations that
are common for all host ioctls. Also update the cxlflash documentation to
create a new section for documenting host ioctls.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d6e32f530df9827070c45b55a6c67dfa8562184c)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Separate AFU internal command handling from AFU sync specifics

BugLink: http://bugs.launchpad.net/bugs/1702521
To date the only supported internal AFU command is AFU sync. The logic
to send an internal AFU command is embedded in the specific AFU sync
handler and would need to be duplicated for new internal AFU commands.

In order to support new internal AFU commands, separate code that is
common for AFU internal commands into a generic transmission routine
and support passing back command status through an IOASA structure.
The first user of this new routine is the existing AFU sync command.
As a cleanup, use a descriptive name for the AFU sync command instead
of a magic number.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit cf2430279006e4afa67dfa4cf952ded38c7ed5b4)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Create character device to provide host management interface

BugLink: http://bugs.launchpad.net/bugs/1702521
The cxlflash driver currently lacks host management interface. Future
devices supported by cxlflash will provide a variety of host-wide
management functions. Examples include LUN provisioning, hardware debug
support, and firmware download.

In order to provide a way to manage the device, a character device will
be created during probe of each adapter. This device will support a set of
ioctls defined in the SISLite specification from which administrators can
manage the adapter.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a834a36b57d93b31f683a5d2cf7d87e3e617cb70)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Add scsi command abort handler

BugLink: http://bugs.launchpad.net/bugs/1702521
To date, CXL flash devices do not support a single command abort operation.
Instead, the SISLite specification provides a context reset operation to
cleanup all pending commands for a given context.

When a context reset is successful, it is guaranteed that the AFU has
aborted all currently pending I/O. This sequence is less invasive than a
device or host reset and can be executed to support scsi command abort
requests. Add eh_abort_handler callback support to process command timeouts
and abort requests.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 7c4c41f172b6d5dda1119ce5f59151bef732a058)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Flush pending commands in cleanup path

BugLink: http://bugs.launchpad.net/bugs/1702521
When the AFU is reset in an error path, pending scsi commands can be
silently dropped without completion or a formal abort. This puts the onus
on the cxlflash driver to notify mid-layer and indicating that the command
can be retried.

Once the card has been quiesced, the hardware send queue lock is acquired
to prevent any data movement while the pending commands are processed.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a1ea04b3ebd9ae5c1cd5bf48be37aba0d93c1acc)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Track pending scsi commands in each hardware queue

BugLink: http://bugs.launchpad.net/bugs/1702521
Currently, there is no book keeping of the pending scsi commands in the
cxlflash driver. This lack of tracking in-flight requests is too
restrictive and requires a heavy-hammer reset each time an adapter error is
encountered. Additionally, it does not allow for commands to be properly
retried.

In order to avoid this problem and to better handle error path command
cleanup, introduce a linked list for each hardware queue that tracks
pending commands.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a002bf830f5df3e622e32fdbde1756bcbb6aedad)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Handle AFU sync failures

BugLink: http://bugs.launchpad.net/bugs/1702521
AFU sync operations are not currently evaluated for failure. This is
acceptable for paths where there is not a dependency on the AFU being
consistent with the host. Examples include link reset events and LUN
cleanup operations. On paths where there is a dependency, such as a LUN
open, a sync failure should be acted upon.

In the event of AFU sync failures, either log or cleanup as appropriate for
operations that are dependent on a successful sync completion.

Update documentation to reflect behavior in the event of an AFU sync
failure.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c2c292f45029a6850cd14c7c2fa4fc479b8f74aa)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Schedule asynchronous reset of the host

BugLink: http://bugs.launchpad.net/bugs/1702521
A context reset failure indicates the AFU is in a bad state. At present,
when such a situation occurs, no further action is taken. This leaves the
adapter in an unusable state with no recoverable actions.

To avoid this situation, context reset failures will be escalated to a host
reset operation. This will be done asynchronously to allow the acting
thread to return to the user with a failure.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0b09e711189952ff9d411593a8d74ec12a956c57)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Reset hardware queue context via specified register

BugLink: http://bugs.launchpad.net/bugs/1702521
Per the SISLite specification, context_reset() writes 0x1 to the LSB of the
reset register. When the AFU processes this reset request, it is expected
to clear the bit after reset is complete. The current implementation simply
checks that the entire value read back is not 1, instead of masking off the
LSB and evaluating it for a change to 0. Should the AFU manipulate other
bits during the reset (reading back a value of 0xF for example), successful
completion will be prematurely indicated given the existing logic.

Additionally, in the event that the context reset operation fails, there
does not currently exist a way to provide feedback to the initiator of the
reset. This poses a problem for the rare case that a context reset fails as
the caller will proceed on the assumption that all is well.

To remedy these issues, refactor the context reset routine to only mask off
the LSB when evaluating for success and return status to the caller. Also
update the context reset handler parameters to pass a hardware queue
reference instead of a single command to better reflect that the entire
queue associated with the context is impacted by the reset.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a96851d3372bf8ee7023712163ad3da9a3e30a29)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Update cxlflash_afu_sync() to return errno

BugLink: http://bugs.launchpad.net/bugs/1702521
The cxlflash_afu_sync() routine returns a negative one to indicate any kind
of failure. This makes it impossible to establish why the error occurred.

Update the return codes to clearly indicate the failure cause to the
caller.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 539d890cecee6b5d7304914afc51b7f53150163d)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Combine the send queue locks

BugLink: http://bugs.launchpad.net/bugs/1702521
Currently there are separate spin locks for the two supported I/O queueing
models. This makes it difficult to serialize with paths outside the enqueue
path.

As a design simplification and to support serialization with enqueue
operations, move to only a single lock that is used for enqueueing
regardless of the queueing model.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 66ea9bcc392017b6df465b6f5847f6eac966a801)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Select IRQ_POLL

BugLink: http://bugs.launchpad.net/bugs/1702521
The driver now uses IRQ_POLL and needs to select it to avoid the
following build error.

ERROR: ".irq_poll_complete" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
ERROR: ".irq_poll_sched" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
ERROR: ".irq_poll_disable" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
ERROR: ".irq_poll_init" [drivers/scsi/cxlflash/cxlflash.ko] undefined!

Fixes: cba06e6de403 ("scsi: cxlflash: Implement IRQ polling for RRQ processing")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 07cc1ccfb84320582c9ac389a21cd81df82bc123)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

cxl: Enable PCI device IDs for future IBM CXL adapters

BugLink: http://bugs.launchpad.net/bugs/1702521
Add support for future IBM Coherent Accelerator (CXL) devices
with an IDs of 0x0623 and 0x0628.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 41e20d959e5919c70058369323cefa57428b7aaf)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Introduce hardware queue steering

BugLink: http://bugs.launchpad.net/bugs/1702521
As an enhancement to distribute requests to multiple hardware queues, add the
infrastructure to hash a SCSI command into a particular hardware queue.
Support the following scenarios when deriving which queue to use: single
queue, tagging when SCSI-MQ enabled, and simple hash via CPU ID when SCSI-MQ
is disabled. Rather than altering the existing send API, the derived hardware
queue is stored in the AFU command where it can be used for sending a command
to the chosen hardware queue.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1dd0c0e4fd02dc5e5bfaf89bd4656aabe4ae3cb3)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Add hardware queues attribute

BugLink: http://bugs.launchpad.net/bugs/1702521
As staging for supporting multiple hardware queues, add an attribute to show
and set the current number of hardware queues for the host. Support specifying
a hard limit or a CPU affinitized value. This will allow the number of
hardware queues to be tuned by a system administrator.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 3065267a80c88d775e8eb34196280e8eee33322f)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support multiple hardware queues

BugLink: http://bugs.launchpad.net/bugs/1702521
Introduce multiple hardware queues to improve legacy I/O path performance.
Each hardware queue is comprised of a master context and associated I/O
resources. The hardware queues are initially implemented as a static array
embedded in the AFU. This will be transitioned to a dynamic allocation in a
later series to improve the memory footprint of the driver.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit bfc0bab172cabf3bb25c48c4c521b317ff4a909d)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Improve asynchronous interrupt processing

BugLink: http://bugs.launchpad.net/bugs/1702521
The method used to decode asynchronous interrupts involves unnecessary loops
to match up bits that are set with corresponding entries in the asynchronous
interrupt information table. This algorithm is wasteful and does not scale
well as new status bits are supported.

As an improvement, use the for_each_set_bit() service to iterate over the
asynchronous status bits and refactor the information table such that it can
be indexed by bit position.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e2ef33fa5958c51ebf0c6f18db19fe927e2185fa)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Fix warnings/errors

BugLink: http://bugs.launchpad.net/bugs/1702521
As a general cleanup, address all reasonable checkpatch warnings and
errors. These include enforcement of comment styles and including named
identifiers in function prototypes.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit fcc87e74a987dc9c0c85f53546df944ede76486a)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Fix power-of-two validations

BugLink: http://bugs.launchpad.net/bugs/1702521
Validation statements to enforce assumptions about specific defines are not
being evaluated by the compiler due to the fact that they reside in a routine
that is not used. To activate them, call the routine as part of module
initialization. As an additional, related cleanup, remove the now-defunct
CXLFLASH_NUM_CMDS.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit cd41e18daf1a21fea5a195a5a74c97c6b183c15a)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Remove unnecessary DMA mapping

BugLink: http://bugs.launchpad.net/bugs/1702521
Devices supported by the cxlflash driver are fully coherent and do not require
a bus address mapping. Avoid unnecessary path length by using the virtual
address and length already present in the scatter-gather entry.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 50b787f7235efbd074bbdf4315e0cc261d85b4d7)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Fence EEH during probe

BugLink: http://bugs.launchpad.net/bugs/1702521
An EEH during probe can lead to a crash as the recovery thread races with the
probe thread. To avoid this issue, introduce new states to fence out EEH
recovery until probe has completed. Also ensure the reset wait queue is
flushed during device removal to avoid orphaned threads.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 323e33428ea23bfb1ae5010b18b4540048b2ad51)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support up to 4 ports

BugLink: http://bugs.launchpad.net/bugs/1702521
Update the driver to allow for future cards with 4 ports.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1cd7fabc82eb06c834956113ff287f8848811fb8)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: SISlite updates to support 4 ports

BugLink: http://bugs.launchpad.net/bugs/1702521
Update the SISlite header to support 4 ports as outlined in the SISlite
specification. Address fallout from structure renames and refreshed
organization throughout the driver. Determine the number of ports supported by
a card from the global port selection mask register reset value.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 565180723294b06b3e60030033847277b9d6d4bb)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Hide FC internals behind common access routine

BugLink: http://bugs.launchpad.net/bugs/1702521
As staging to support FC-related updates to the SISlite specification,
introduce helper routines to obtain references to FC resources that exist
within the global map. This will allow changes to the underlying global map
structure without impacting existing code paths.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0aa14887c60c27e3385295ee85f5ac079ae2ffb5)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Remove port configuration assumptions

BugLink: http://bugs.launchpad.net/bugs/1702521
At present, the cxlflash driver only supports hardware with two FC ports. The
code was initially designed with this assumption and is dependent on having
two FC ports - adding more ports will break logic within the driver.

To mitigate this issue, remove the existing port assumptions and transition
the code to support more than two ports. As a side effect, clarify the
interpretation of the DK_CXLFLASH_ALL_PORTS_ACTIVE flag.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8fa4f1770d56af6f0a5a862f1fd298a4eeea94f3)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support dynamic number of FC ports

BugLink: http://bugs.launchpad.net/bugs/1702521
Transition from a static number of FC ports to a value that is derived during
probe. For now, a static value is used but this will later be based on the
type of card being configured.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 78ae028e823701148e4915759459ee79597ea8ec)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Update sysfs helper routines to pass config structure

BugLink: http://bugs.launchpad.net/bugs/1702521
As staging for future function, pass the config pointer instead of the AFU
pointer for port-related sysfs helper routines.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 3b225cd32a05b627a6ca366f364a824beaabecc5)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Implement IRQ polling for RRQ processing

BugLink: http://bugs.launchpad.net/bugs/1702521
Currently, RRQ processing takes place on hardware interrupt context. This can
be a heavy burden in some environments due to the overhead encountered while
completing RRQ entries. In an effort to improve system performance, use the
IRQ polling API to schedule this processing on softirq context.

This function will be disabled by default until starting values can be
established for the hardware supported by this driver.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit cba06e6de4038cd44a3e93a92ad982c372b8a14e)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Serialize RRQ access and support offlevel processing

BugLink: http://bugs.launchpad.net/bugs/1702521
As further staging to support processing the HRRQ by other means, access to
the HRRQ needs to be serialized by a disabled lock. This will allow safe
access in other non-hardware interrupt contexts. In an effort to minimize the
period where interrupts are disabled, support is added to queue up commands
harvested from the RRQ such that they can be processed with hardware
interrupts enabled. While this doesn't offer any improvement with processing
on a hardware interrupt it will help when IRQ polling is supported and the
command completions can execute on softirq context.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f918b4a8e6f8bb59c44045f85d10fd9cc7e5a4c0)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Separate RRQ processing from the RRQ interrupt handler

BugLink: http://bugs.launchpad.net/bugs/1702521
In order to support processing the HRRQ by other means (e.g. polling), the
processing portion of the current RRQ interrupt handler needs to be broken out
into a separate routine. This will allow RRQ processing from places other than
the RRQ hardware interrupt handler.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 76a6ebbeef26b004c36a0c8ee0496bae5428fc31)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Enable PCI device ID for future IBM CXL Flash AFU

BugLink: http://bugs.launchpad.net/bugs/1702521
Add support for a future IBM Coherent Accelerator (CXL) flash AFU with
an ID of 0x0624.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 943445200b049d5179b95297e5372d399c8ab0e2)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Cancel scheduled workers before stopping AFU

BugLink: http://bugs.launchpad.net/bugs/1702521
When processing an AFU asynchronous interrupt, if the action results in an
operation that requires off level processing (a link reset for example),
the worker thread is scheduled. In the meantime a reset event (i.e.: EEH)
could unmap the AFU to recover. This results in an Oops when the worker
thread tries to access the AFU mapping.

[c000000f17e03b90] d000000007cd5978 cxlflash_worker_thread+0x268/0x550
[c000000f17e03c40] c00000000011883c process_one_work+0x1dc/0x680
[c000000f17e03ce0] c000000000118e80 worker_thread+0x1a0/0x520
[c000000f17e03d80] c000000000126174 kthread+0xf4/0x100
[c000000f17e03e30] c00000000000a47c ret_from_kernel_thread+0x5c/0xe0

In an effort to avoid this, a mapcount was introduced in
commit b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline")
but due to the race condition described above, this solution is incomplete.

In order to fully resolve this problem and to simplify things, this commit
removes the mapcount solution. Instead, the scheduled worker thread is
cancelled after interrupts have been disabled and prior to the mapping
being freed.

Fixes: b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline")
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0df5bef739601f18bffc0d256ae451f239a826bd)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Cleanup prints

BugLink: http://bugs.launchpad.net/bugs/1702521
The usage of prints within the cxlflash driver is inconsistent. This
hinders debug and makes the driver source and log output appear sloppy.

The following cleanups help unify the prints within cxlflash:
- move all prints to dev-* where possible
- transition all hex prints to lowercase
- standardize variable prints in debug output
- derive pointers in a consistent manner
- change int to bool where appropriate
- remove superfluous data from prints and print statements that do not
make sense

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit fb67d44dfbdf85d984b9b40284e90636a3a7b21d)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Support SQ Command Mode

BugLink: http://bugs.launchpad.net/bugs/1702521
The SISLite specification outlines a new queuing model to improve
over the MMIO-based IOARRIN model that exists today. This new model
uses a submission queue that exists in host memory and is shared with
the device. Each entry in the queue is an IOARCB that describes a
transfer request. When requests are submitted, IOARCBs ('current'
position tracked in host software) are populated and the submission
queue tail pointer is then updated via MMIO to make the device aware
of the requests.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 696d0b0c715360ce28fedd3c8b009d3771a5ddeb)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

scsi: cxlflash: Refactor context reset to share reset logic

BugLink: http://bugs.launchpad.net/bugs/1702521
As staging for supporting hardware with different context reset
registers but a similar reset procedure, refactor the existing context
reset routine to move the reset logic to a common routine. This will
allow hardware with a different reset register to leverage existing
code.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9c7d1ee5f13a7130f6d3df307ec010e9e003fa98)
Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

ath10k: search SMBIOS for OEM board file extension

BugLink: https://bugs.launchpad.net/bugs/1666742
Board Data File (BDF) is loaded upon driver boot-up procedure. The right
board data file is identified, among others, by device and sybsystem ids.

The problem, however, can occur when the (default) board data file cannot
fulfill with the vendor requirements and it is necessary to use a different
board data file.

To solve the issue QCA uses SMBIOS type 0xF8 to store Board Data File Name
Extension to specify the extension/variant name. The driver will take the
extension suffix into consideration and will load the right (non-default)
board data file if necessary.

If it is unnecessary to use extension board data file, please leave the
SMBIOS field blank and default configuration will be used.

Example:
If a default board data file for a specific board is identified by a string
      "bus=pci,vendor=168c,device=003e,subsystem-vendor=1028,
       subsystem-device=0310"
then the OEM specific data file, if used, could be identified by variant
suffix:
      "bus=pci,vendor=168c,device=003e,subsystem-vendor=1028,
       subsystem-device=0310,variant=DE_1AB"

If board data file name extension is set but board-2.bin does not contain
board data file for the variant, the driver will fallback to the default
board data file not to break backward compatibility.

This was first applied in commit f2593cb1b291 ("ath10k: Search SMBIOS for OEM
board file extension") but later reverted in commit 005c3490e9db ("Revert
"ath10k: Search SMBIOS for OEM board file extension"". This patch is now
otherwise the same as commit f2593cb1b291 except the regression fixed.

Signed-off-by: Waldemar Rymarkiewicz <ext.waldemar.rymarkiewicz@tieto.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
(cherry picked from commit 1657b8f84ed9fc1d2a100671f1d42d6286f20073)
Signed-off-by: Shrirang Bagul <shrirang.bagul@canonical.com>
Acked-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

nvme: explicitly disable APST on quirked devices

BugLink: https://bugs.launchpad.net/bugs/1699004
A user reports APST is enabled, even when the NVMe is quirked or with
option "default_ps_max_latency_us=0".

The current logic will not set APST if the device is quirked. But the
NVMe in question will enable APST automatically.

Separate the logic "apst is supported" and "to enable apst", so we can
use the latter one to explicitly disable APST at initialiaztion.

BugLink: https://bugs.launchpad.net/bugs/1699004
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Keith Busch <keith.busch@intel.com>
(backported from commit 76a5af841755a0427229a6a77ca83781d61e5b2a)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

nvme: Add nvme_core.force_apst to ignore the NO_APST quirk

BugLink: https://bugs.launchpad.net/bugs/1699004
We're probably going to be stuck quirking APST off on an over-broad
range of devices for 4.11. Let's make it easy to override the quirk
for testing.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(backported from commit c35e30b4727b390ce7a6dd7ead31335320c2b83e)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

nvme: Display raw APST configuration via DYNAMIC_DEBUG

BugLink: https://bugs.launchpad.net/bugs/1699004
Debugging APST is currently a bit of a pain. This gives optional
simple log messages that describe the APST state.

The easiest way to use this is probably with the nvme_core.dyndbg=+p
module parameter.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit fb0dc3993b537e12ce63511d535ff86efff13c8f)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: SAUCE: PCI: Support hibmc VGA cards behind a misbehaving HiSilicon bridge

BugLink: https://bugs.launchpad.net/bugs/1698706
The HiSilicon D05 board has some PCI bridges (PCI ID 19e5:1610) that
are not spec-compliant: the VGA Enable bit is set to 0 in hardware
and writes do not change it.

This stops VGA arbitrartion from marking a VGA card behind the bridge
as a boot device, and therefore breaks Xorg auto-configuration.

The hibmc VGA card (PCI ID 19e5:1711) is known to work when behind
these bridges.

Provide a quirk so that this combination of bridge and card is eligible
to be the default VGA card.

This fixes Xorg auto-detection.

Cc: Xinliang Liu <z.liuxinliang@hisilicon.com>
Cc: Rongrong Zou <zourongrong@gmail.com>
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1701272
Commit 4c3b89effc28 ("powerpc/powernv: Add sanity checks to
pnv_pci_get_{gpu|npu}_dev") introduced explicit warnings in
pnv_pci_get_npu_dev() when a PCIe device has no associated device-tree
node. However not all PCIe devices have an of_node and
pnv_pci_get_npu_dev() gets indirectly called at least once for every
PCIe device in the system. This results in spurious WARN_ON()'s so
remove it.

The same situation should not exist for pnv_pci_get_gpu_dev() as any
NPU based PCIe device requires a device-tree node.

Fixes: 4c3b89effc28 ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 377aa6b0efbaa29cfeecd8b9244641217f9544ca)
Signed-off-by: Breno Leitao <breno.leitao@gmail.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1701272
NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.

We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit bbd5ff50afffcf4a01d05367524736c57607a478)
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>