]> git.proxmox.com Git - mirror_ubuntu-bionic-kernel.git/log
mirror_ubuntu-bionic-kernel.git
6 years agoscsi: hisi_sas: service interrupt ITCT_CLR interrupt in v2 hw
Xiang Chen [Thu, 10 Aug 2017 16:09:33 +0000 (00:09 +0800)]
scsi: hisi_sas: service interrupt ITCT_CLR interrupt in v2 hw

This patch is a fix related to freeing a device in v2 hw driver.

Before, we polled to ITCT CLR interrupt to check if a device is free.

This was error prone, as if the interrupt doesn't occur in 10us, we miss
processing it.

To avoid this situation, service this interrupt and sync the event with
a completion.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: add irq and tasklet cleanup in v2 hw
Xiang Chen [Thu, 10 Aug 2017 16:09:32 +0000 (00:09 +0800)]
scsi: hisi_sas: add irq and tasklet cleanup in v2 hw

This patch adds support to clean-up allocated IRQs and kill tasklets
when probe fails and for driver removal.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: remove repeated device config in v2 hw
Xiang Chen [Thu, 10 Aug 2017 16:09:31 +0000 (00:09 +0800)]
scsi: hisi_sas: remove repeated device config in v2 hw

This patch removes some repeated configurations:

(1) The device id of the device is already set in the alloc function, so
    we don't need to modify in free device function.

(2) Field dev_type and dev_status are configured in hisi_sas_dev_gone(),
    so there is no need for repeated config in free_device_v3_hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: use array for v2 hw ECC errors
John Garry [Thu, 10 Aug 2017 16:09:30 +0000 (00:09 +0800)]
scsi: hisi_sas: use array for v2 hw ECC errors

The code to print ECC errors in v2 hw driver is very repetitive.  This
patch condensed the code by looping an array of errors.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: add v2 hw DFX feature
Xiaofei Tan [Thu, 10 Aug 2017 16:09:29 +0000 (00:09 +0800)]
scsi: hisi_sas: add v2 hw DFX feature

Add DFX feature for v2 hw. We are adding support for
the following errors:
- loss_of_dword_sync_count
- invalid_dword_count
- phy_reset_problem_count
- running_disparity_error_count

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: fix v2 hw underflow residual value
Xiang Chen [Thu, 10 Aug 2017 16:09:28 +0000 (00:09 +0800)]
scsi: hisi_sas: fix v2 hw underflow residual value

The value dw0 is the residual bytes when UNDERFLOW error happens, but we
filled the residual with the value of dw3 before. So change the residual
from dw3 to dw0.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: avoid potential v2 hw interrupt issue
Xiang Chen [Thu, 10 Aug 2017 16:09:27 +0000 (00:09 +0800)]
scsi: hisi_sas: avoid potential v2 hw interrupt issue

When some interrupts happen together, we need to process every interrupt
one-by-one, and should not return immediately when one interrupt process
is finished being processed.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: hisi_sas: fix reset and port ID refresh issues
Xiaofei Tan [Thu, 10 Aug 2017 16:09:26 +0000 (00:09 +0800)]
scsi: hisi_sas: fix reset and port ID refresh issues

This patch provides fixes for the following issues:

1. Fix issue of controller reset required to send commands. For reset
   process, it may be required to send commands to the controller, but
   not during soft reset.  So add HISI_SAS_NOT_ACCEPT_CMD_BIT to prevent
   executing a task during this period.

2. Send a broadcast event in rescan topology to detect any topology
   changes during reset.

3. Previously it was not ensured that libsas has processed the PHY up
   and down events after reset. Potentially this could cause an issue
   that we still process the PHY event after reset. So resolve this by
   flushing shot workqueue in LLDD reset.

4. Port ID requires refresh after reset. The port ID generated after
   reset is not guaranteed to be the same as before reset, so it needs
   to be refreshed for each device's ITCT.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: change driver version to 1.1.2-125
Kevin Barnett [Thu, 10 Aug 2017 18:47:15 +0000 (13:47 -0500)]
scsi: smartpqi: change driver version to 1.1.2-125

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: add in new controller ids
Kevin Barnett [Thu, 10 Aug 2017 18:47:09 +0000 (13:47 -0500)]
scsi: smartpqi: add in new controller ids

Update the driver’s PCI IDs to match the latest Microsemi controllers

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: update kexec and power down support
Kevin Barnett [Thu, 10 Aug 2017 18:47:03 +0000 (13:47 -0500)]
scsi: smartpqi: update kexec and power down support

Add PQI reset to driver shutdown callback to work around controller bug.

During an 1.) OS shutdown or 2.) kexec outside of a kdump, the Linux
kernel will clear BME on our controller.

If BME is cleared during a controller/host PCIe transfer, the controller
will lock up.

So we perform a PQI reset in the driver's shutdown callback function to
eliminate the possibility of a controller/host PCIe transfer being
active when the kernel clears BME immediately after calling the driver's
shutdown callback.

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: cleanup doorbell register usage.
Kevin Barnett [Thu, 10 Aug 2017 18:46:57 +0000 (13:46 -0500)]
scsi: smartpqi: cleanup doorbell register usage.

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: update pqi passthru ioctl
Kevin Barnett [Thu, 10 Aug 2017 18:46:51 +0000 (13:46 -0500)]
scsi: smartpqi: update pqi passthru ioctl

 - make pass-thru requests bi-directional

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: enhance BMIC cache flush
Kevin Barnett [Thu, 10 Aug 2017 18:46:45 +0000 (13:46 -0500)]
scsi: smartpqi: enhance BMIC cache flush

 - distinguish between shutdown and non-shutdown.

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: smartpqi: add pqi reset quiesce support
Kevin Barnett [Thu, 10 Aug 2017 18:46:39 +0000 (13:46 -0500)]
scsi: smartpqi: add pqi reset quiesce support

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: dpt_i2o: remove redundant null check on array device
Colin Ian King [Wed, 9 Aug 2017 10:17:41 +0000 (11:17 +0100)]
scsi: dpt_i2o: remove redundant null check on array device

The null check on pHba->channel[chan].device is redundant because
device is an array and hence can never be null. Remove the check.

Detected by CoverityScan, CID#115362 ("Array compared against 0")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: qla2xxx: use dma_mapping_error to check map errors
Pan Bian [Tue, 8 Aug 2017 13:55:30 +0000 (21:55 +0800)]
scsi: qla2xxx: use dma_mapping_error to check map errors

The return value of dma_map_single() should be checked by
dma_mapping_error(). However, in function qla26xx_dport_diagnostics(), its
return value is checked against NULL, which could result in failures.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mvsas: replace kfree with scsi_host_put
Pan Bian [Tue, 8 Aug 2017 12:02:51 +0000 (20:02 +0800)]
scsi: mvsas: replace kfree with scsi_host_put

The return value of scsi_host_alloc() should be released by
scsi_host_put(). However, in function mvs_pci_init(), kfree()
is used. This patch replaces kfree() with scsi_host_put() to avoid
possible memory leaks.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: pm8001: fix double free in pm8001_pci_probe
Pan Bian [Tue, 8 Aug 2017 11:40:30 +0000 (19:40 +0800)]
scsi: pm8001: fix double free in pm8001_pci_probe

In function pm8001_pci_probe(), on errors that the control flow jumps to
label err_out_ha_free, function pm8001_free() is called. In pm8001_free(),
scsi_host_put() is called to release shost, which keeps the return value
of scsi_host_alloc(). After pm8001_free() returns, kfree() is called to
free shost again, resulting in a double free bug. This patch removes
scsi_host_put() from pm8001_free() and explicitly calls scsi_host_put()
to release Scsi_Host in need.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: megaraid_sas: fix allocate instance->pd_info twice
weiping [Tue, 8 Aug 2017 05:15:55 +0000 (13:15 +0800)]
scsi: megaraid_sas: fix allocate instance->pd_info twice

fix allocate instance->pd_info twice which was introduced by 96188a89cc6d.

Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: esp_scsi: Always clear msg_out_len after MESSAGE OUT phase
Finn Thain [Fri, 4 Aug 2017 05:43:20 +0000 (01:43 -0400)]
scsi: esp_scsi: Always clear msg_out_len after MESSAGE OUT phase

After sending a message, always clear esp->msg_out_len. Otherwise,
eh_abort_handler may subsequently fail to send an ABORT TASK SET
message.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: esp_scsi: Avoid sending ABORT TASK SET messages
Finn Thain [Fri, 4 Aug 2017 05:43:20 +0000 (01:43 -0400)]
scsi: esp_scsi: Avoid sending ABORT TASK SET messages

If an LLD aborts a task set, it should complete the affected commands
with the appropriate result code. In a couple of cases esp_scsi doesn't
do so.

When the initiator receives an unhandled message, just respond by sending
a MESSAGE REJECT instead of ABORT TASK SET, and thus avoid the issue.

OTOH, a MESSAGE REJECT sent by a target can be taken as an indication
that the initiator messed up somehow. It isn't always possible to abort
correctly, so just fall back on a SCSI bus reset, which will complete the
affected commands with the appropriate result code.

For example, certain Apple (Sony) CD-ROM drives, when the non-existent
LUN 1 is scanned, can't handle the INQUIRY command. The problem is not
detected until the initiator gets a MESSAGE REJECT. Whenever esp_scsi
sees that message, it raises ATN and sends ABORT TASK SET -- but
neglects to complete the failed scmd.

The target then goes into DATA OUT phase (probably bogus), while the ESP
device goes into disconnected mode (surprising, given the bus phase).
The next Transfer Information command from esp_scsi then causes
an Invalid Command interrupt because that command is not valid when in
disconnected mode:

mac_esp: using PDMA for controller 0
mac_esp mac_esp.0: esp0: regs[50f10000:(null)] irq[19]
mac_esp mac_esp.0: esp0: is a ESP236, 16 MHz (ccf=4), SCSI ID 7
scsi host0: esp
scsi 0:0:0:0: Direct-Access     SEAGATE  ST318416N        0010 PQ: 0 ANSI: 3
scsi target0:0:0: Beginning Domain Validation
scsi target0:0:0: asynchronous
scsi target0:0:0: Domain Validation skipping write tests
scsi target0:0:0: Ending Domain Validation
scsi 0:0:3:0: CD-ROM            SONY     CD-ROM CDU-8003A 1.9a PQ: 0 ANSI: 2 CCS
scsi target0:0:3: Beginning Domain Validation
scsi target0:0:3: FAST-5 SCSI 2.0 MB/s ST (500 ns, offset 15)
scsi target0:0:3: Domain Validation skipping write tests
scsi target0:0:3: Ending Domain Validation
scsi host0: unexpected IREG 40
scsi host0: Dumping command log
scsi host0: ent[2] CMD val[c2] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[01] event[0c]
scsi host0: ent[3] CMD val[00] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[0c]
scsi host0: ent[4] EVENT val[0d] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[0c]
scsi host0: ent[5] EVENT val[03] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[0d]
scsi host0: ent[6] CMD val[90] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[03]
scsi host0: ent[7] EVENT val[05] sreg[91] seqreg[04] sreg2[00] ireg[18] ss[00] event[03]
scsi host0: ent[8] EVENT val[0d] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[05]
scsi host0: ent[9] CMD val[01] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[0d]
scsi host0: ent[10] CMD val[11] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[0d]
scsi host0: ent[11] EVENT val[0b] sreg[93] seqreg[cc] sreg2[00] ireg[10] ss[00] event[0d]
scsi host0: ent[12] CMD val[12] sreg[97] seqreg[cc] sreg2[00] ireg[08] ss[00] event[0b]
scsi host0: ent[13] EVENT val[0c] sreg[97] seqreg[cc] sreg2[00] ireg[08] ss[00] event[0b]
scsi host0: ent[14] CMD val[44] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[00] event[0c]
scsi host0: ent[15] CMD val[01] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[01] event[0c]
scsi host0: ent[16] CMD val[c2] sreg[90] seqreg[cc] sreg2[00] ireg[20] ss[01] event[0c]
scsi host0: ent[17] CMD val[00] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[0c]
scsi host0: ent[18] EVENT val[0d] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[0c]
scsi host0: ent[19] EVENT val[06] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[0d]
scsi host0: ent[20] CMD val[01] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[06]
scsi host0: ent[21] CMD val[10] sreg[87] seqreg[02] sreg2[00] ireg[18] ss[00] event[06]
scsi host0: ent[22] CMD val[1a] sreg[87] seqreg[ca] sreg2[00] ireg[08] ss[00] event[06]
scsi host0: ent[23] CMD val[12] sreg[87] seqreg[ca] sreg2[00] ireg[08] ss[00] event[06]
scsi host0: ent[24] EVENT val[0d] sreg[87] seqreg[ca] sreg2[00] ireg[08] ss[00] event[06]
scsi host0: ent[25] EVENT val[09] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[0d]
scsi host0: ent[26] CMD val[01] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[09]
scsi host0: ent[27] CMD val[10] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[09]
scsi host0: ent[28] EVENT val[0a] sreg[86] seqreg[ca] sreg2[00] ireg[10] ss[00] event[09]
scsi host0: ent[29] EVENT val[0d] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[0a]
scsi host0: ent[30] EVENT val[04] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[0d]
scsi host0: ent[31] CMD val[01] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[04]
scsi host0: ent[0] CMD val[90] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[04]
scsi host0: ent[1] EVENT val[05] sreg[80] seqreg[ca] sreg2[00] ireg[20] ss[00] event[04]
scsi target0:0:3: FAST-5 SCSI 2.0 MB/s ST (500 ns, offset 15)
scsi target0:0:0: asynchronous
sr 0:0:3:0: [sr0] scsi-1 drive
cdrom: Uniform CD-ROM driver Revision: 3.20
sd 0:0:0:0: Attached scsi generic sg0 type 0
sr 0:0:3:0: Attached scsi generic sg1 type 5

This patch resolves this issue because the bus reset causes the INQUIRY
command to fail earlier, and return the appropriate result code.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: esp_scsi: Clean up control flow and dead code
Finn Thain [Fri, 4 Aug 2017 05:43:19 +0000 (01:43 -0400)]
scsi: esp_scsi: Clean up control flow and dead code

This patch improves readability. There are no functional changes.

Since this touches on a questionable ESP_INTR_DC conditional, add some
commentary to help others who may (as I did) find themselves chasing an
"Invalid Command" error after the device flags this condition.

This cleanup also eliminates a warning from "make W=1":
drivers/scsi/esp_scsi.c: In function 'esp_finish_select':
drivers/scsi/esp_scsi.c:1233:5: warning: variable 'orig_select_state' set but not used [-Wunused-but-set-variable]
  u8 orig_select_state;

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mac_esp: Fix PIO transfers for MESSAGE IN phase
Finn Thain [Fri, 4 Aug 2017 05:43:19 +0000 (01:43 -0400)]
scsi: mac_esp: Fix PIO transfers for MESSAGE IN phase

When in MESSAGE IN phase, the ESP device does not automatically
acknowledge each byte that is transferred by PIO. The mac_esp driver
neglects to explicitly ack them, which causes a timeout during messages
larger than one byte (e.g. tag bytes during reconnect). Fix this with an
ESP_CMD_MOK command after each byte.

The MESSAGE IN phase is also different in that each byte transferred
raises ESP_INTR_FDONE. So don't exit the transfer loop for this interrupt,
for this phase.

That resolves the "Reconnect IRQ2 timeout" error on those Macs which use
PIO transfers instead of PDMA. This patch also improves on the weak tests
for unexpected interrupts and phase changes during PIO transfers.

Tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 02507a80b35e ("[PATCH] [SCSI] mac_esp: fix PIO mode, take 2")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mac_esp: Avoid type warning from sparse
Finn Thain [Fri, 4 Aug 2017 05:43:19 +0000 (01:43 -0400)]
scsi: mac_esp: Avoid type warning from sparse

Avoid the following warning from "make C=1":

  CHECK   drivers/scsi/mac_esp.c
drivers/scsi/mac_esp.c:357:30: warning: incorrect type in initializer (different address spaces)
drivers/scsi/mac_esp.c:357:30:    expected unsigned char [usertype] *fifo
drivers/scsi/mac_esp.c:357:30:    got void [noderef] <asn:2>*

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoarcmsr: add const to bin_attribute structures
Bhumika Goyal [Tue, 1 Aug 2017 14:21:14 +0000 (19:51 +0530)]
arcmsr: add const to bin_attribute structures

Add const to bin_attribute structures as they are only passed to the
functions system_{remove/create}_bin_file. The arguments passed are of
type const, so declare the structures to be const.

Done using Coccinelle.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: early returns for traces disabled via level
Martin Peschke [Fri, 28 Jul 2017 10:31:08 +0000 (12:31 +0200)]
scsi: zfcp: early returns for traces disabled via level

This patch adds early checks to avoid burning CPU cycles on
the assembly of trace entries which would be skipped anyway.

Introduce a static const variable to keep the trace level to check with
debug_level_enabled() in sync with the actual trace emit with
debug_event(). In order not to refactor the SAN tracing too much,
simply use a define instead.

This change is only for the non / semi hot paths,
while the actual (I/O) hot path was already improved earlier:
zfcp_dbf_scsi() is already guarded by its only caller _zfcp_dbf_scsi()
since commit dcd20e2316cd ("[SCSI] zfcp: Only collect SCSI debug data for
matching trace levels").
zfcp_dbf_hba_fsf_res() is already guarded by its only caller
zfcp_dbf_hba_fsf_response() since commit 2e261af84cdb ("[SCSI] zfcp: Only
collect FSF/HBA debug data for matching trace levels").

Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
[maier@linux.vnet.ibm.com: rebase, reword, default level 3 branch prediction]
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: clean up unnecessary module_param_named() with no_auto_port_rescan
Martin Peschke [Fri, 28 Jul 2017 10:31:07 +0000 (12:31 +0200)]
scsi: zfcp: clean up unnecessary module_param_named() with no_auto_port_rescan

Improves commit 43f60cbd56c4 ("[SCSI] zfcp: No automatic port_rescan on
events")

Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
[maier@linux.vnet.ibm.com: reword, underscore in description to match sysfs]
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: clean up a member of struct zfcp_qdio that was assigned but never used
Martin Peschke [Fri, 28 Jul 2017 10:31:06 +0000 (12:31 +0200)]
scsi: zfcp: clean up a member of struct zfcp_qdio that was assigned but never used

v2.6.38 commit a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing
for HBA records.")
dropped trace information previously introduced with
v2.6.27 commit c3baa9a26c5a ("[SCSI] zfcp: Add information about interrupt
to trace.")
but kept and needlessly assigned a now no longer used struct field.

Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
[maier@linux.vnet.ibm.com: reword, added git history]
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: clean up no longer existent prototype from zfcp API header
Steffen Maier [Fri, 28 Jul 2017 10:31:05 +0000 (12:31 +0200)]
scsi: zfcp: clean up no longer existent prototype from zfcp API header

Commit a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA
records.") refactored zfcp_dbf_hba_berr into zfcp_dbf_hba_bit_err
but added the prototype for the latter without removing it for the former.

Suggested-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: clean up redundant code with fall through in link down SRB switch case
Martin Peschke [Fri, 28 Jul 2017 10:31:04 +0000 (12:31 +0200)]
scsi: zfcp: clean up redundant code with fall through in link down SRB switch case

Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
[maier@linux.vnet.ibm.com: re-worded short description for more details]
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: fix kernel doc comment typos for struct zfcp_dbf_scsi
Steffen Maier [Fri, 28 Jul 2017 10:31:03 +0000 (12:31 +0200)]
scsi: zfcp: fix kernel doc comment typos for struct zfcp_dbf_scsi

Improves commit 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing
for SCSI records.")

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: use endianness conversions with common FC(P) struct fields
Steffen Maier [Fri, 28 Jul 2017 10:31:02 +0000 (12:31 +0200)]
scsi: zfcp: use endianness conversions with common FC(P) struct fields

Just to silence sparse. Since zfcp only exists for s390 and
s390 is big endian, this has been working correctly without conversions
and all the new conversions are NOPs so no performance impact.

Nonetheless, use the conversion on the constant expression where possible.

NB: N_Port-IDs have always been handled with hton24 or ntoh24 conversions
because they also convert to / from character array.

Affected common code structs and .fields are:

HOT I/O PATH:
fcp_cmnd .fc_dl
   FCP command: regular SCSI I/O, including DIX case

SEMI-HOT I/O PATH:
fcp_cmnd .fc_dl
   recovery FCP command: task management function (LUN / target reset)
fcp_resp_ext
   FCP response having FCP_SNS_LEN_VAL with .fr_rsp_len .fr_sns_len
   FCP response having FCP_RESID_UNDER with .fr_resid

RECOVERY / DISCOVERY PATHS:
fc_ct_hdr .ct_cmd .ct_mr_size
   zfcp auto port scan [GPN_FT] with fc_gpn_ft_resp.fp_wwpn,
   recovery for returned port [GID_PN] with fc_ns_gid_pn.fn_wwpn,
   get symbolic port name [GSPN],
   register symbolic port name [RSPN] (NPIV only).
fc_els_rscn .rscn_plen
   incoming ELS (RSCN).
fc_els_flogi .fl_wwpn .fl_wwnn
   incoming ELS (PLOGI),
   port open response with .fl_csp.sp_bb_data .fl_cssp[0..3].cp_class,
   FCP channel physical port,
   point-to-point peer (P2P only).
fc_els_logo .fl_n_port_wwn
   incoming ELS (LOGO).
fc_els_adisc .adisc_wwnn .adisc_wwpn
   path test after RSCN for gone target port.

Since v4.10 commit 05de97003c77 ("linux/types.h: enable endian checks for
all sparse builds"), below sparse endianness reports appear by default.
Previously, one needed to pass argument CF="-D__CHECK_ENDIAN__" to make
as in: $ make C=1 CF="-D__CHECK_ENDIAN__" M=drivers/s390/scsi.

Silenced sparse warnings and one error:

$ make C=1 M=drivers/s390/scsi
...
  CHECK   drivers/s390/scsi/zfcp_dbf.c
drivers/s390/scsi/zfcp_dbf.c:463:22: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_dbf.c:476:28: warning: restricted __be16 degrades to integer
  CC      drivers/s390/scsi/zfcp_dbf.o
...
  CHECK   drivers/s390/scsi/zfcp_fc.c
drivers/s390/scsi/zfcp_fc.c:263:26: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:299:41: warning: incorrect type in argument 2 (different base types)
drivers/s390/scsi/zfcp_fc.c:299:41:    expected unsigned long long [unsigned] [usertype] wwpn
drivers/s390/scsi/zfcp_fc.c:299:41:    got restricted __be64 [usertype] fl_wwpn
drivers/s390/scsi/zfcp_fc.c:309:40: warning: incorrect type in argument 2 (different base types)
drivers/s390/scsi/zfcp_fc.c:309:40:    expected unsigned long long [unsigned] [usertype] wwpn
drivers/s390/scsi/zfcp_fc.c:309:40:    got restricted __be64 [usertype] fl_n_port_wwn
drivers/s390/scsi/zfcp_fc.c:338:31: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:355:24: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:355:24:    expected restricted __be16 [usertype] ct_cmd
drivers/s390/scsi/zfcp_fc.c:355:24:    got unsigned short [unsigned] [usertype] cmd
drivers/s390/scsi/zfcp_fc.c:356:28: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:356:28:    expected restricted __be16 [usertype] ct_mr_size
drivers/s390/scsi/zfcp_fc.c:356:28:    got int
drivers/s390/scsi/zfcp_fc.c:379:36: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:379:36:    expected restricted __be64 [usertype] fn_wwpn
drivers/s390/scsi/zfcp_fc.c:379:36:    got unsigned long long [unsigned] [usertype] wwpn
drivers/s390/scsi/zfcp_fc.c:463:18: warning: restricted __be64 degrades to integer
drivers/s390/scsi/zfcp_fc.c:465:17: warning: cast from restricted __be64
drivers/s390/scsi/zfcp_fc.c:473:20: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:473:20:    expected unsigned long long [unsigned] [usertype] wwnn
drivers/s390/scsi/zfcp_fc.c:473:20:    got restricted __be64 [usertype] fl_wwnn
drivers/s390/scsi/zfcp_fc.c:474:29: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:474:29:    expected unsigned int [unsigned] [usertype] maxframe_size
drivers/s390/scsi/zfcp_fc.c:474:29:    got restricted __be16 [usertype] sp_bb_data
drivers/s390/scsi/zfcp_fc.c:476:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:478:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:480:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:482:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:500:28: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:500:28:    expected unsigned long long [unsigned] [usertype] wwnn
drivers/s390/scsi/zfcp_fc.c:500:28:    got restricted __be64 [usertype] adisc_wwnn
drivers/s390/scsi/zfcp_fc.c:502:38: warning: restricted __be64 degrades to integer
drivers/s390/scsi/zfcp_fc.c:541:40: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:541:40:    expected restricted __be64 [usertype] adisc_wwpn
drivers/s390/scsi/zfcp_fc.c:541:40:    got unsigned long long [unsigned] [usertype] port_name
drivers/s390/scsi/zfcp_fc.c:542:40: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:542:40:    expected restricted __be64 [usertype] adisc_wwnn
drivers/s390/scsi/zfcp_fc.c:542:40:    got unsigned long long [unsigned] [usertype] node_name
drivers/s390/scsi/zfcp_fc.c:669:16: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:696:24: warning: restricted __be64 degrades to integer
drivers/s390/scsi/zfcp_fc.c:699:54: warning: incorrect type in argument 2 (different base types)
drivers/s390/scsi/zfcp_fc.c:699:54:    expected unsigned long long [unsigned] [usertype] <noident>
drivers/s390/scsi/zfcp_fc.c:699:54:    got restricted __be64 [usertype] fp_wwpn
  CC      drivers/s390/scsi/zfcp_fc.o
  CHECK   drivers/s390/scsi/zfcp_fsf.c
drivers/s390/scsi/zfcp_fsf.c:479:34: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:479:34:    expected unsigned long long [unsigned] [usertype] port_name
drivers/s390/scsi/zfcp_fsf.c:479:34:    got restricted __be64 [usertype] fl_wwpn
drivers/s390/scsi/zfcp_fsf.c:480:34: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:480:34:    expected unsigned long long [unsigned] [usertype] node_name
drivers/s390/scsi/zfcp_fsf.c:480:34:    got restricted __be64 [usertype] fl_wwnn
drivers/s390/scsi/zfcp_fsf.c:506:36: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:506:36:    expected unsigned long long [unsigned] [usertype] peer_wwpn
drivers/s390/scsi/zfcp_fsf.c:506:36:    got restricted __be64 [usertype] fl_wwpn
drivers/s390/scsi/zfcp_fsf.c:507:36: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:507:36:    expected unsigned long long [unsigned] [usertype] peer_wwnn
drivers/s390/scsi/zfcp_fsf.c:507:36:    got restricted __be64 [usertype] fl_wwnn
drivers/s390/scsi/zfcp_fc.h:269:46: warning: restricted __be32 degrades to integer
drivers/s390/scsi/zfcp_fc.h:270:29: error: incompatible types in comparison expression (different base types)

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: use common code fcp_cmnd and fcp_resp with union in fsf_qtcb_bottom_io
Steffen Maier [Fri, 28 Jul 2017 10:31:01 +0000 (12:31 +0200)]
scsi: zfcp: use common code fcp_cmnd and fcp_resp with union in fsf_qtcb_bottom_io

This eases crash dump analysis by automatically dissecting these
protocol headers at least somewhat rather than getting a string
interpretation of large unstructured character array buffer fields.

Also, we can get rid of some unnecessary and error-prone type casts.

This change is possible since v2.6.33 commit 4318e08c84e4
("[SCSI] zfcp: Update FCP protocol related code").

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: clarify that we don't need "link" test on failed open port
Steffen Maier [Fri, 28 Jul 2017 10:31:00 +0000 (12:31 +0200)]
scsi: zfcp: clarify that we don't need "link" test on failed open port

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: more fitting constant for fc_ct_hdr.ct_reason on port scan response
Steffen Maier [Fri, 28 Jul 2017 10:30:59 +0000 (12:30 +0200)]
scsi: zfcp: more fitting constant for fc_ct_hdr.ct_reason on port scan response

v2.6.33 commit dbf5dfe9dbce ("[SCSI] zfcp: Use common code definitions for
FC CT structs") replaced own definitions with common code definitions.
While FC_BA_RJT_UNABLE happens to be defined with the same value 9 as
FC_FS_RJT_UNABL and thus also works, here we should use the latter from
fc_gs.h.
See also its use in libfc's fc_disc_gpn_ft_resp().

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: trace high part of "new" 64 bit SCSI LUN
Steffen Maier [Fri, 28 Jul 2017 10:30:58 +0000 (12:30 +0200)]
scsi: zfcp: trace high part of "new" 64 bit SCSI LUN

Complements debugging aspects of the otherwise functionally complete
v3.17 commit 9cb78c16f5da ("scsi: use 64-bit LUNs").

While I don't have access to a target exporting 3 or 4 level LUNs,
I did test it by explicitly attaching a non-existent fake 4 level LUN
by means of zfcp sysfs attribute "unit_add".
In order to see corresponding trace records of otherwise successful
events, we had to increase the trace level of area SCSI and HBA to 6.

$ echo 6 > /sys/kernel/debug/s390dbf/zfcp_0.0.1880_scsi/level
$ echo 6 > /sys/kernel/debug/s390dbf/zfcp_0.0.1880_hba/level

$ echo 0x4011402240334044 > \
  /sys/bus/ccw/drivers/zfcp/0.0.1880/0x50050763031bd327/unit_add

Example output formatted by an updated zfcpdbf from the s390-tools
package interspersed with kernel messages at scsi_logging_level=4605:

Timestamp      : ...
Area           : REC
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : scsla_1
LUN            : 0x4011402240334044
WWPN           : 0x50050763031bd327
D_ID           : 0x00......
Adapter status : 0x5400050b
Port status    : 0x54000001
LUN status     : 0x41000000
Ready count    : 0x00000001
Running count  : 0x00000000
ERP want       : 0x01
ERP need       : 0x01

scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY pass 1 length 36
scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY successful with code 0x0

Timestamp      : ...
Area           : HBA
Subarea        : 00
Level          : 6
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : fs_norm
Request ID     : 0x<inquiry2-req-id>
Request status : 0x00000010
FSF cmnd       : 0x00000001
FSF sequence no: 0x...
FSF issued     : ...
FSF stat       : 0x00000000
FSF stat qual  : 00000000 00000000 00000000 00000000
Prot stat      : 0x00000001
Prot stat qual : ........ ........ 00000000 00000000
Port handle    : 0x...
LUN handle     : 0x...
|
Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 6
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : rsl_nor
Request ID     : 0x<inquiry2-req-id>
SCSI ID        : 0x00000000
SCSI LUN       : 0x40224011
SCSI LUN high  : 0x40444033 <=======================
SCSI result    : 0x00000000
SCSI retries   : 0x00
SCSI allowed   : 0x03
SCSI scribble  : 0x<inquiry2-req-id>
SCSI opcode    : 12000000 a4000000 00000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU     : 00000000 00000000 00000000 00000000
                 00000000 00000000

scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY pass 2 length 164
scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY successful with code 0x0
scsi 2:0:0:4630896905707208721: scsi scan: peripheral device type of 31, \
no device added

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 9cb78c16f5da ("scsi: use 64-bit LUNs")
Cc: <stable@vger.kernel.org> #3.17+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response
Steffen Maier [Fri, 28 Jul 2017 10:30:57 +0000 (12:30 +0200)]
scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response

At the default trace level, we only trace unsuccessful events including
FSF responses.

zfcp_dbf_hba_fsf_response() only used protocol status and FSF status to
decide on an unsuccessful response. However, this is only one of multiple
possible sources determining a failed struct zfcp_fsf_req.

An FSF request can also "fail" if its response runs into an ERP timeout
or if it gets dismissed because a higher level recovery was triggered
[trace tags "erscf_1" or "erscf_2" in zfcp_erp_strategy_check_fsfreq()].
FSF requests with ERP timeout are:
FSF_QTCB_EXCHANGE_CONFIG_DATA, FSF_QTCB_EXCHANGE_PORT_DATA,
FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT or
FSF_QTCB_CLOSE_PHYSICAL_PORT for target ports,
FSF_QTCB_OPEN_LUN, FSF_QTCB_CLOSE_LUN.
One example is slow queue processing which can cause follow-on errors,
e.g. FSF_PORT_ALREADY_OPEN after FSF_QTCB_OPEN_PORT_WITH_DID timed out.
In order to see the root cause, we need to see late responses even if the
channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD.
Example trace records formatted with zfcpdbf from the s390-tools package:

Timestamp      : ...
Area           : REC
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : ...
Record ID      : 1
Tag            : fcegpf1
LUN            : 0xffffffffffffffff
WWPN           : 0x<WWPN>
D_ID           : 0x00<D_ID>
Adapter status : 0x5400050b
Port status    : 0x41200000
LUN status     : 0x00000000
Ready count    : 0x00000001
Running count  : 0x...
ERP want       : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT
ERP need       : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT
|
Timestamp      : ... 30 seconds later
Area           : REC
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : ...
Record ID      : 2
Tag            : erscf_2
LUN            : 0xffffffffffffffff
WWPN           : 0x<WWPN>
D_ID           : 0x00<D_ID>
Adapter status : 0x5400050b
Port status    : 0x41200000
LUN status     : 0x00000000
Request ID     : 0x<request_ID>
ERP status     : 0x10000000 ZFCP_STATUS_ERP_TIMEDOUT
ERP step       : 0x0800 ZFCP_ERP_STEP_PORT_OPENING
ERP action     : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT
ERP count      : 0x00
|
Timestamp      : ... later than previous record
Area           : HBA
Subarea        : 00
Level          : 5 > default level => 3 <= default level
Exception      : -
CPU ID         : 00
Caller         : ...
Record ID      : 1
Tag            : fs_qtcb => fs_rerr
Request ID     : 0x<request_ID>
Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED
| ZFCP_STATUS_FSFREQ_CLEANUP
FSF cmnd       : 0x00000005
FSF sequence no: 0x...
FSF issued     : ... > 30 seconds ago
FSF stat       : 0x00000000 FSF_GOOD
FSF stat qual  : 00000000 00000000 00000000 00000000
Prot stat      : 0x00000001 FSF_PROT_GOOD
Prot stat qual : 00000000 00000000 00000000 00000000
Port handle    : 0x...
LUN handle     : 0x00000000
QTCB log length: ...
QTCB log info  : ...

In case of problems detecting that new responses are waiting on the input
queue, we sooner or later trigger adapter recovery due to an FSF request
timeout (trace tag "fsrth_1").
FSF requests with FSF request timeout are:
typically FSF_QTCB_ABORT_FCP_CMND; but theoretically also
FSF_QTCB_EXCHANGE_CONFIG_DATA or FSF_QTCB_EXCHANGE_PORT_DATA via sysfs,
FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT for WKA ports,
FSF_QTCB_FCP_CMND for task management function (LUN / target reset).
One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD
because the channel filled in the response via DMA into the request's QTCB.

In a theroretical case, inject code can create an erroneous FSF request
on purpose. If data router is enabled, it uses deferred error reporting.
A READ SCSI command can succeed with FSF_PROT_GOOD, FSF_GOOD, and
SAM_STAT_GOOD. But on writing the read data to host memory via DMA,
it can still fail, e.g. if an intentionally wrong scatter list does not
provide enough space. Rather than getting an unsuccessful response,
we get a QDIO activate check which in turn triggers adapter recovery.
One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD
because the channel filled in the response via DMA into the request's QTCB.
Example trace records formatted with zfcpdbf from the s390-tools package:

Timestamp      : ...
Area           : HBA
Subarea        : 00
Level          : 6 > default level => 3 <= default level
Exception      : -
CPU ID         : ..
Caller         : ...
Record ID      : 1
Tag            : fs_norm => fs_rerr
Request ID     : 0x<request_ID2>
Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED
| ZFCP_STATUS_FSFREQ_CLEANUP
FSF cmnd       : 0x00000001
FSF sequence no: 0x...
FSF issued     : ...
FSF stat       : 0x00000000 FSF_GOOD
FSF stat qual  : 00000000 00000000 00000000 00000000
Prot stat      : 0x00000001 FSF_PROT_GOOD
Prot stat qual : ........ ........ 00000000 00000000
Port handle    : 0x...
LUN handle     : 0x...
|
Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ..
Caller         : ...
Record ID      : 1
Tag            : rsl_err
Request ID     : 0x<request_ID2>
SCSI ID        : 0x...
SCSI LUN       : 0x...
SCSI result    : 0x000e0000 DID_TRANSPORT_DISRUPTED
SCSI retries   : 0x00
SCSI allowed   : 0x05
SCSI scribble  : 0x<request_ID2>
SCSI opcode    : 28... Read(10)
FCP rsp inf cod: 0x00
FCP rsp IU     : 00000000 00000000 00000000 00000000
                                         ^^ SAM_STAT_GOOD
                 00000000 00000000

Only with luck in both above cases, we could see a follow-on trace record
of an unsuccesful event following a successful but late FSF response with
FSF_PROT_GOOD and FSF_GOOD. Typically this was the case for I/O requests
resulting in a SCSI trace record "rsl_err" with DID_TRANSPORT_DISRUPTED
[On ZFCP_STATUS_FSFREQ_DISMISSED, zfcp_fsf_protstatus_eval() sets
ZFCP_STATUS_FSFREQ_ERROR seen by the request handler functions as failure].
However, the reason for this follow-on trace was invisible because the
corresponding HBA trace record was missing at the default trace level
(by default hidden records with tags "fs_norm", "fs_qtcb", or "fs_open").

On adapter recovery, after we had shut down the QDIO queues, we perform
unsuccessful pseudo completions with flag ZFCP_STATUS_FSFREQ_DISMISSED
for each pending FSF request in zfcp_fsf_req_dismiss_all().
In order to find the root cause, we need to see all pseudo responses even
if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD.

Therefore, check zfcp_fsf_req.status for ZFCP_STATUS_FSFREQ_DISMISSED
or ZFCP_STATUS_FSFREQ_ERROR and trace with a new tag "fs_rerr".

It does not matter that there are numerous places which set
ZFCP_STATUS_FSFREQ_ERROR after the location where we trace an FSF response
early. These cases are based on protocol status != FSF_PROT_GOOD or
== FSF_PROT_FSF_STATUS_PRESENTED and are thus already traced by default
as trace tag "fs_perr" or "fs_ferr" respectively.

NB: The trace record with tag "fssrh_1" for status read buffers on dismiss
all remains. zfcp_fsf_req_complete() handles this and returns early.
All other FSF request types are handled separately and as described above.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features")
Fixes: 2e261af84cdb ("[SCSI] zfcp: Only collect FSF/HBA debug data for matching trace levels")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records
Steffen Maier [Fri, 28 Jul 2017 10:30:56 +0000 (12:30 +0200)]
scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records

If the FCP_RSP UI has optional parts (FCP_SNS_INFO or FCP_RSP_INFO) and
thus does not fit into the fsp_rsp field built into a SCSI trace record,
trace the full FCP_RSP UI with all optional parts as payload record
instead of just FCP_SNS_INFO as payload and
a 1 byte RSP_INFO_CODE part of FCP_RSP_INFO built into the SCSI record.

That way we would also get the full FCP_SNS_INFO in case a
target would ever send more than
min(SCSI_SENSE_BUFFERSIZE==96, ZFCP_DBF_PAY_MAX_REC==256)==96.

The mandatory part of FCP_RSP IU is only 24 bytes.
PAYload costs at least one full PAY record of 256 bytes anyway.
We cap to the hardware response size which is only FSF_FCP_RSP_SIZE==128.
So we can just put the whole FCP_RSP IU with any optional parts into
PAYload similarly as we do for SAN PAY since v4.9 commit aceeffbb59bb
("zfcp: trace full payload of all SAN records (req,resp,iels)").
This does not cause any additional trace records wasting memory.

Decoded trace records were confusing because they showed a hard-coded
sense data length of 96 even if the FCP_RSP_IU field FCP_SNS_LEN showed
actually less.

Since the same commit, we set pl_len for SAN traces to the full length of a
request/response even if we cap the corresponding trace.
In contrast, here for SCSI traces we set pl_len to the pre-computed
length of FCP_RSP IU considering SNS_LEN or RSP_LEN if valid.
Nonetheless we trace a hardcoded payload of length FSF_FCP_RSP_SIZE==128
if there were optional parts.
This makes it easier for the zfcpdbf tool to format only the relevant
part of the long FCP_RSP UI buffer. And any trailing information is still
available in the payload trace record just in case.

Rename the payload record tag from "fcp_sns" to "fcp_riu" to make the new
content explicit to zfcpdbf which can then pick a suitable field name such
as "FCP rsp IU all:" instead of "Sense info :"
Also, the same zfcpdbf can still be backwards compatible with "fcp_sns".

Old example trace record before this fix, formatted with the tool zfcpdbf
from s390-tools:

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 3
Exception      : -
CPU id         : ..
Caller         : 0x...
Record id      : 1
Tag            : rsl_err
Request id     : 0x<request_id>
SCSI ID        : 0x...
SCSI LUN       : 0x...
SCSI result    : 0x00000002
SCSI retries   : 0x00
SCSI allowed   : 0x05
SCSI scribble  : 0x<request_id>
SCSI opcode    : 00000000 00000000 00000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU     : 00000000 00000000 00000202 00000000
                                       ^^==FCP_SNS_LEN_VALID
                 00000020 00000000
                 ^^^^^^^^==FCP_SNS_LEN==32
Sense len      : 96 <==min(SCSI_SENSE_BUFFERSIZE,ZFCP_DBF_PAY_MAX_REC)
Sense info     : 70000600 00000018 00000000 29000000
                 00000400 00000000 00000000 00000000
                 00000000 00000000 00000000 00000000<==superfluous
                 00000000 00000000 00000000 00000000<==superfluous
                 00000000 00000000 00000000 00000000<==superfluous
                 00000000 00000000 00000000 00000000<==superfluous

New example trace records with this fix:

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : rsl_err
Request ID     : 0x<request_id>
SCSI ID        : 0x...
SCSI LUN       : 0x...
SCSI result    : 0x00000002
SCSI retries   : 0x00
SCSI allowed   : 0x03
SCSI scribble  : 0x<request_id>
SCSI opcode    : a30c0112 00000000 02000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU     : 00000000 00000000 00000a02 00000200
                 00000020 00000000
FCP rsp IU len : 56
FCP rsp IU all : 00000000 00000000 00000a02 00000200
                                       ^^=FCP_RESID_UNDER|FCP_SNS_LEN_VALID
                 00000020 00000000 70000500 00000018
                 ^^^^^^^^==FCP_SNS_LEN
                                   ^^^^^^^^^^^^^^^^^
                 00000000 240000cb 00011100 00000000
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 00000000 00000000
                 ^^^^^^^^^^^^^^^^^==FCP_SNS_INFO

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : lr_okay
Request ID     : 0x<request_id>
SCSI ID        : 0x...
SCSI LUN       : 0x...
SCSI result    : 0x00000000
SCSI retries   : 0x00
SCSI allowed   : 0x05
SCSI scribble  : 0x<request_id>
SCSI opcode    : <CDB of unrelated SCSI command passed to eh handler>
FCP rsp inf cod: 0x00
FCP rsp IU     : 00000000 00000000 00000100 00000000
                 00000000 00000008
FCP rsp IU len : 32
FCP rsp IU all : 00000000 00000000 00000100 00000000
                                       ^^==FCP_RSP_LEN_VALID
                 00000000 00000008 00000000 00000000
                          ^^^^^^^^==FCP_RSP_LEN
                                   ^^^^^^^^^^^^^^^^^==FCP_RSP_INFO

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: fix missing trace records for early returns in TMF eh handlers
Steffen Maier [Fri, 28 Jul 2017 10:30:55 +0000 (12:30 +0200)]
scsi: zfcp: fix missing trace records for early returns in TMF eh handlers

For problem determination we need to see that we were in scsi_eh
as well as whether and why we were successful or not.

The following commits introduced new early returns without adding
a trace record:

v2.6.35 commit a1dbfddd02d2
("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
on fc_block_scsi_eh() returning != 0 which is FAST_IO_FAIL,

v2.6.30 commit 63caf367e1c9
("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
on not having gotten an FSF request after the maximum number of retry
attempts and thus could not issue a TMF and has to return FAILED.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
Fixes: 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA
Steffen Maier [Fri, 28 Jul 2017 10:30:54 +0000 (12:30 +0200)]
scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA

Without this fix we get SCSI trace records on task management functions
which cannot be correlated to HBA trace records because all fields
related to the FSF request are empty (zero).
Also, the FCP_RSP_IU is missing as well as any sense data if available.

This was caused by v2.6.14 commit 8a36e4532ea1 ("[SCSI] zfcp: enhancement
of zfcp debug features") introducing trace records for TMFs but
hard coding NULL for a possibly existing TMF FSF request.
The scsi_cmnd scribble is also zero or unrelated for the TMF request
so it also could not lookup a suitable FSF request from there.

A broken example trace record formatted with zfcpdbf from the s390-tools
package:

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : lr_fail
Request ID     : 0x0000000000000000
                   ^^^^^^^^^^^^^^^^ no correlation to HBA record
SCSI ID        : 0x<scsitarget>
SCSI LUN       : 0x<scsilun>
SCSI result    : 0x000e0000
SCSI retries   : 0x00
SCSI allowed   : 0x05
SCSI scribble  : 0x0000000000000000
SCSI opcode    : 2a000017 3bb80000 08000000 00000000
FCP rsp inf cod: 0x00
                   ^^ no TMF response
FCP rsp IU     : 00000000 00000000 00000000 00000000
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 00000000 00000000
                 ^^^^^^^^^^^^^^^^^ no interesting FCP_RSP_IU
Sense len      : ...
^^^^^^^^^^^^^^^^^^^^ no sense data length
Sense info     : ...
^^^^^^^^^^^^^^^^^^^^ no sense data content, even if present

There are some true cases where we really do not have an FSF request:
"rsl_fai" from zfcp_dbf_scsi_fail_send() called for early
returns / completions in zfcp_scsi_queuecommand(),
"abrt_or", "abrt_bl", "abrt_ru", "abrt_ar" from
zfcp_scsi_eh_abort_handler() where we did not get as far,
"lr_nres", "tr_nres" from zfcp_task_mgmt_function() where we're
successful and do not need to do anything because adapter stopped.
For these cases it's correct to pass NULL for fsf_req to _zfcp_dbf_scsi().

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records
Steffen Maier [Fri, 28 Jul 2017 10:30:53 +0000 (12:30 +0200)]
scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records

v4.9 commit aceeffbb59bb ("zfcp: trace full payload of all SAN records
(req,resp,iels)") fixed trace data loss of 2.6.38 commit 2c55b750a884
("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
necessary for problem determination, e.g. to see the
currently active zone set during automatic port scan.

While it already saves space by not dumping any empty residual entries
of the large successful GPN_FT response (4 pages), there are seldom cases
where the GPN_FT response is unsuccessful and likely does not have
FC_NS_FID_LAST set in fp_flags so we did not cap the trace record.
We typically see such case for an initiator WWPN, which is not in any zone.

Cap unsuccessful responses to at least the actual basic CT_IU response
plus whatever fits the SAN trace record built-in "payload" buffer
just in case there's trailing information
of which we would at least see the existence and its beginning.

In order not to erroneously cap successful responses, we need to swap
calling the trace function and setting the CT / ELS status to success (0).

Example trace record pair formatted with zfcpdbf:

Timestamp      : ...
Area           : SAN
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : fssct_1
Request ID     : 0x<request_id>
Destination ID : 0x00fffffc
SAN req short  : 01000000 fc020000 01720ffc 00000000
                 00000008
SAN req length : 20
|
Timestamp      : ...
Area           : SAN
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 2
Tag            : fsscth2
Request ID     : 0x<request_id>
Destination ID : 0x00fffffc
SAN resp short : 01000000 fc020000 80010000 00090700
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
SAN resp length: 16384
San resp info  : 01000000 fc020000 80010000 00090700
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]
                 00000000 00000000 00000000 00000000 [trailing info]

The fix saves all but one of the previously associated 64 PAYload trace
record chunks of size 256 bytes each.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: aceeffbb59bb ("zfcp: trace full payload of all SAN records (req,resp,iels)")
Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path
Benjamin Block [Fri, 28 Jul 2017 10:30:52 +0000 (12:30 +0200)]
scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path

Up until now zfcp would just ignore the FCP_RESID_OVER flag in the FCP
response IU. When this flag is set, it is possible, in regards to the
FCP standard, that the storage-server processes the command normally, up
to the point where data is missing and simply ignores those.

In this case no CHECK CONDITION would be set, and because we ignored the
FCP_RESID_OVER flag we resulted in at least a data loss or even
-corruption as a follow-up error, depending on how the
applications/layers on top behave. To prevent this, we now set the
host-byte of the corresponding scsi_cmnd to DID_ERROR.

Other storage-behaviors, where the same condition results in a CHECK
CONDITION set in the answer, don't need to be changed as they are
handled in the mid-layer already.

Following is an example trace record decoded with zfcpdbf from the
s390-tools package. We forcefully injected a fc_dl which is one byte too
small:

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : rsl_err
Request ID     : 0x...
SCSI ID        : 0x...
SCSI LUN       : 0x...
SCSI result    : 0x00070000
                     ^^DID_ERROR
SCSI retries   : 0x..
SCSI allowed   : 0x..
SCSI scribble  : 0x...
SCSI opcode    : 2a000000 00000000 08000000 00000000
FCP rsp inf cod: 0x00
FCP rsp IU     : 00000000 00000000 00000400 00000001
                                       ^^fr_flags==FCP_RESID_OVER
                                         ^^fr_status==SAM_STAT_GOOD
                                            ^^^^^^^^fr_resid
                 00000000 00000000

As of now, we don't actively handle to possibility that a response IU
has both flags - FCP_RESID_OVER and FCP_RESID_UNDER - set at once.

Reported-by: Luke M. Hopkins <lmhopkin@us.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 553448f6c483 ("[SCSI] zfcp: Message cleanup")
Fixes: ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
Cc: <stable@vger.kernel.org> #2.6.33+
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled
Steffen Maier [Fri, 28 Jul 2017 10:30:51 +0000 (12:30 +0200)]
scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled

Since commit db007fc5e20c ("[SCSI] Command protection operation"),
scsi_eh_prep_cmnd() saves scmd->prot_op and temporarily resets it to
SCSI_PROT_NORMAL.
Other FCP LLDDs such as qla2xxx and lpfc shield their queuecommand()
to only access any of scsi_prot_sg...() if
(scsi_get_prot_op(cmd) != SCSI_PROT_NORMAL).

Do the same thing for zfcp, which introduced DIX support with
commit ef3eb71d8ba4 ("[SCSI] zfcp: Introduce experimental support for
DIF/DIX").

Otherwise, TUR SCSI commands as part of scsi_eh likely fail in zfcp,
because the regular SCSI command with DIX protection data, that scsi_eh
re-uses in scsi_send_eh_cmnd(), of course still has
(scsi_prot_sg_count() != 0) and so zfcp sends down bogus requests to the
FCP channel hardware.

This causes scsi_eh_test_devices() to have (finish_cmds == 0)
[not SCSI device is online or not scsi_eh_tur() failed]
so regular SCSI commands, that caused / were affected by scsi_eh,
are moved to work_q and scsi_eh_test_devices() itself returns false.
In turn, it unnecessarily escalates in our case in scsi_eh_ready_devs()
beyond host reset to finally scsi_eh_offline_sdevs()
which sets affected SCSI devices offline with the following kernel message:

"kernel: sd H:0:T:L: Device offlined - not ready after error recovery"

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: ef3eb71d8ba4 ("[SCSI] zfcp: Introduce experimental support for DIF/DIX")
Cc: <stable@vger.kernel.org> #2.6.36+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: convert bool-definitions to use 'true' instead of '1'
Benjamin Block [Fri, 28 Jul 2017 10:30:50 +0000 (12:30 +0200)]
scsi: zfcp: convert bool-definitions to use 'true' instead of '1'

Better form and cleans remaining warnings.

Found with scripts/coccinelle/misc/boolinit.cocci.

Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: Remove unneeded linux/miscdevice.h include
Corentin Labbe [Fri, 28 Jul 2017 10:30:49 +0000 (12:30 +0200)]
scsi: zfcp: Remove unneeded linux/miscdevice.h include

drivers/s390/scsi/zfcp_aux.c does not contain any miscdevice so the
inclusion of linux/miscdevice.h is unnecessary.

[maier@linux.vnet.ibm.com: just for the records, this is in fact a
 minor missing code cleanup of the following older "feature"
 which also dropped the only former use of a misc device in zfcp:
 commit 663e0890e31c ("[SCSI] zfcp: remove access control tables
    interface")
 commit b5dc3c4800cc ("[SCSI] zfcp: remove access control tables
    interface (keep sysfs files)")
 commit 1b33ef23946a ("zfcp: remove access control tables interface
     (port leftovers)")]

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: use setup_timer instead of init_timer
Lukáš Korenčik [Fri, 28 Jul 2017 10:30:48 +0000 (12:30 +0200)]
scsi: zfcp: use setup_timer instead of init_timer

Use initialization with setup_timer function instead of using
init_timer function and data fields. It improves readability.

Signed-off-by: Lukáš Korenčik <xkorenc1@fi.muni.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: zfcp: replace zfcp_qdio_sbale_count by sg_nents
LABBE Corentin [Fri, 28 Jul 2017 10:30:47 +0000 (12:30 +0200)]
scsi: zfcp: replace zfcp_qdio_sbale_count by sg_nents

The zfcp_qdio_sbale_count function do the same work than sg_nents().
So replace it by sg_nents() for removing duplicate code.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: libcxgbi: use ndev->ifindex to find route
Varun Prakash [Sat, 5 Aug 2017 14:06:11 +0000 (19:36 +0530)]
scsi: libcxgbi: use ndev->ifindex to find route

If cxgbi_ep_connect() is called with valid shost then find associated
ndev and use ndev->ifindex to find route.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mpt3sas: Fix memory allocation failure test in 'mpt3sas_base_attach()'
Christophe JAILLET [Sun, 6 Aug 2017 22:51:29 +0000 (00:51 +0200)]
scsi: mpt3sas: Fix memory allocation failure test in 'mpt3sas_base_attach()'

In the lines above this test, 8 'kzalloc' are performed, but only 7
results are tested.

Add the missing one (i.e. '!ioc->port_enable_cmds.reply').

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aic7xxx: regenerate firmware files
Michał Mirosław [Thu, 3 Aug 2017 23:28:09 +0000 (01:28 +0200)]
scsi: aic7xxx: regenerate firmware files

Regenerate firmware files to make cleaner base for following fix.
This removes some unused definitions and reorders some #defines, but
the code remains the same.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aic7xxx: fix firmware build deps
Michał Mirosław [Thu, 3 Aug 2017 23:28:08 +0000 (01:28 +0200)]
scsi: aic7xxx: fix firmware build deps

We need to override Kbuild rules for copying shipped files, otherwise
aic7xxx_reg.h and aic7xxx_reg_print.c will be ovewritten by old versions.

Fixes: 516b7db593f3a541e2e98867575c3c697f41a247
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aic7xxx: remove empty function
Michał Mirosław [Thu, 3 Aug 2017 23:28:07 +0000 (01:28 +0200)]
scsi: aic7xxx: remove empty function

ahc_platform_dump_card_state() does nothing. Remove it.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoMAINTAINERS: Add myself to S390 ZFCP DRIVER as a co-maintainer
Benjamin Block [Fri, 28 Jul 2017 10:29:43 +0000 (12:29 +0200)]
MAINTAINERS: Add myself to S390 ZFCP DRIVER as a co-maintainer

I have been working with Steffen on zFCP for quite a while now and we
decided adding me as a co-maintainer might be a good thing.

Acked-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: fc: start decoupling fc_block_scsi_eh from scsi_cmnd
Steffen Maier [Tue, 25 Jul 2017 14:14:24 +0000 (16:14 +0200)]
scsi: fc: start decoupling fc_block_scsi_eh from scsi_cmnd

Scsi_cmnd is an unsuitable argument for eh_device_reset_handler(),
eh_target_reset_handler(), and eh_host_reset_handler() which do not have
the scope of one single SCSI command.  These callbacks tend to use
fc_block_scsi_eh() requiring scsi_cmnd.  In order to start decoupling
above eh callbacks from scsi_cmnd, introduce a new variant of the
function called fc_block_rport() taking an fc_rport as argument.
Refactor the old fc_block_scsi_eh() to simply delegate to
fc_block_rport().

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: qla2xxx: Fix remoteport disconnect for FC-NVMe
himanshu.madhani@cavium.com [Fri, 21 Jul 2017 16:32:27 +0000 (09:32 -0700)]
scsi: qla2xxx: Fix remoteport disconnect for FC-NVMe

Signed-off-by: Duane Grigsby <duane.grigsby@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: qla2xxx: Simpify unregistration of FC-NVMe local/remote ports
himanshu.madhani@cavium.com [Fri, 21 Jul 2017 16:32:26 +0000 (09:32 -0700)]
scsi: qla2xxx: Simpify unregistration of FC-NVMe local/remote ports

Simplified waiting for unregister local/remote FC-NVMe ports
to complete cleanup.

Signed-off-by: Duane Grigsby <duane.grigsby@cavium.com>
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: qla2xxx: Added change to enable ZIO for FC-NVMe devices
Duane Grigsby [Fri, 21 Jul 2017 16:32:25 +0000 (09:32 -0700)]
scsi: qla2xxx: Added change to enable ZIO for FC-NVMe devices

Add support to the driver to set the exchange threshold value for
the number of outstanding AENs.

Signed-off-by: Duane Grigsby <duane.grigsby@cavium.com>
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: qla2xxx: Move function prototype to correct header
himanshu.madhani@cavium.com [Fri, 21 Jul 2017 16:32:24 +0000 (09:32 -0700)]
scsi: qla2xxx: Move function prototype to correct header

Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: qla2xxx: Cleanup FC-NVMe code
himanshu.madhani@cavium.com [Fri, 21 Jul 2017 16:32:23 +0000 (09:32 -0700)]
scsi: qla2xxx: Cleanup FC-NVMe code

This patch does not change any functionality.

Following cleanups have been done as requested by reviewer

- Changed waitQ --> waitq
- Collapsed multiple debug statements into single
- Remove extra parentheses in if-else as per operator precedence
- Remove unnecessary casting

Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: remove DRIVER_ATTR() usage
Greg Kroah-Hartman [Wed, 19 Jul 2017 12:50:06 +0000 (14:50 +0200)]
scsi: remove DRIVER_ATTR() usage

It's better to use the DRIVER_ATTR_RW() and DRIVER_ATTR_RO() macros to
explicitly show that this is a read/write or read/only sysfs file.  So
convert the remaining SCSI drivers that use the old style to use the
newer macros.

Bonus is that this removes some checkpatch.pl warnings :)

This is part of a series to drop DRIVER_ATTR() from the tree entirely.

Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Cc: Willem Riede <osst@riede.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: Convert to using %pOF instead of full_name
Rob Herring [Tue, 18 Jul 2017 21:43:28 +0000 (16:43 -0500)]
scsi: Convert to using %pOF instead of full_name

Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: ufs: changing maintainer
Prabu Thangamuthu [Tue, 18 Jul 2017 09:15:58 +0000 (09:15 +0000)]
scsi: ufs: changing maintainer

As per internal decision, Joao Pinto will be maintainer for DWC UFS
driver.

Signed-off-by: Prabu Thangamuthu <prabut@synopsys.com>
Acked-by: Joao Pinto <Joao.Pinto@synopsys.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: fusion: fix string overflow warning
Arnd Bergmann [Mon, 17 Jul 2017 12:00:00 +0000 (14:00 +0200)]
scsi: fusion: fix string overflow warning

gcc points out a theorerical string overflow:

drivers/message/fusion/mptbase.c: In function 'mpt_detach':
drivers/message/fusion/mptbase.c:2103:17: error: '%s' directive writing up to 31 bytes into a region of size 28 [-Werror=format-overflow=]
sprintf(pname, MPT_PROCFS_MPTBASEDIR "/%s/summary", ioc->name);
               ^~~~~
drivers/message/fusion/mptbase.c:2103:2: note: 'sprintf' output between 13 and 44 bytes into a destination of size 32

We can simply double the size of the local buffer here to be on the
safe side, and using snprintf() instead of sprintf() protects us
if ioc->name was not terminated properly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: gdth: increase the procfs event buffer size
Arnd Bergmann [Fri, 14 Jul 2017 12:06:59 +0000 (14:06 +0200)]
scsi: gdth: increase the procfs event buffer size

We print a 256 byte event string into a buffer that is only 161
bytes long, this is clearly wrong:

drivers/scsi/gdth_proc.c: In function 'gdth_show_info':
drivers/scsi/gdth.c:3660:41: error: '%s' directive writing up to 255 bytes into a region of size between 141 and 150 [-Werror=format-overflow=]
             sprintf(buffer,"Adapter %d: %s\n",
                                         ^~
/git/arm-soc/drivers/scsi/gdth.c:3660:13: note: 'sprintf' output between 13 and 277 bytes into a destination of size 161
             sprintf(buffer,"Adapter %d: %s\n",
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 dvr->eu.async.ionode,dvr->event_string);
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

gcc calculates that the worst case buffer size would be 277 bytes,
so we can use that.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: fnic: fix format string overflow warning
Arnd Bergmann [Fri, 14 Jul 2017 12:06:58 +0000 (14:06 +0200)]
scsi: fnic: fix format string overflow warning

The MSI interrupt name can require 11 bytes in addition to the device name,
for a total of 23 bytes:

drivers/scsi/fnic/fnic_isr.c: In function 'fnic_request_intr':
drivers/scsi/fnic/fnic_isr.c:192:4: error: '-fcs-rq' directive writing 7 bytes into a region of size between 5 and 16 [-Werror=format-overflow=]
    "%.11s-fcs-rq", fnic->name);
drivers/scsi/fnic/fnic_isr.c:206:3: note: 'sprintf' output between 12 and 23 bytes into a destination of size 16
   sprintf(fnic->msix[FNIC_MSIX_ERR_NOTIFY].devname,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    "%.11s-err-notify", fnic->name);

This extends the buffer to fit any possible value.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: gdth: avoid buffer overflow warning
Arnd Bergmann [Fri, 14 Jul 2017 12:06:57 +0000 (14:06 +0200)]
scsi: gdth: avoid buffer overflow warning

gcc notices that we would overflow the buffer for the
inquiry of the product name if we have too many adapters:

drivers/scsi/gdth.c: In function 'gdth_next':
drivers/scsi/gdth.c:2357:29: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
         sprintf(inq.product,"Host Drive  #%02d",t);
                             ^~~~~~~~~~~~~~~~~~~
drivers/scsi/gdth.c:2357:9: note: 'sprintf' output between 16 and 17 bytes into a destination of size 16
         sprintf(inq.product,"Host Drive  #%02d",t);

This won't happen in practice, so just use snprintf to
truncate the string.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mpt3sas: fix format overflow warning
Arnd Bergmann [Fri, 14 Jul 2017 12:06:55 +0000 (14:06 +0200)]
scsi: mpt3sas: fix format overflow warning

We print the driver name into one string and then add and ID
and copy it into a second string of the same length, at which
point gcc complains about a possible overflow:

drivers/scsi/mpt3sas/mpt3sas_scsih.c: In function '_scsih_probe':
drivers/scsi/mpt3sas/mpt3sas_scsih.c:8884:21: error: '_cm' directive writing 3 bytes into a region of size between 1 and 32 [-Werror=format-overflow=]
printf(ioc->name, "%s_cm%d", ioc->driver_name, ioc->id);
                  ^~~~~~~~~
drivers/scsi/mpt3sas/mpt3sas_scsih.c:8884:21: note: directive argument in the range [0, 255]
drivers/scsi/mpt3sas/mpt3sas_scsih.c:8884:2: note: 'sprintf' output between 5 and 38 bytes into a destination of size 32
  sprintf(ioc->name, "%s_cm%d", ioc->driver_name, ioc->id);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Making the first string shorter is sufficient to avoid the
warning here, as we know it can only contain either "mpt2sas"
or "mpt3sas".

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: megaraid: fix format-overflow warning
Arnd Bergmann [Fri, 14 Jul 2017 12:06:54 +0000 (14:06 +0200)]
scsi: megaraid: fix format-overflow warning

gcc-7 complains that the firmware version strings might overflow
for some values:

drivers/scsi/megaraid.c: In function 'megaraid_probe_one':
drivers/scsi/megaraid.c:314:33: error: '%d' directive writing between 1 and 2 bytes into a region of size between 1 and 2 [-Werror=format-overflow=]
drivers/scsi/megaraid.c:314:33: note: directive argument in the range [0, 15]
drivers/scsi/megaraid.c:314:3: note: 'sprintf' output between 7 and 9 bytes into a destination of size 7
drivers/scsi/megaraid.c:320:35: error: '%d' directive writing between 1 and 2 bytes into a region of size between 1 and 2 [-Werror=format-overflow=]
drivers/scsi/megaraid.c:320:35: note: directive argument in the range [0, 15]
drivers/scsi/megaraid.c:320:3: note: 'sprintf' output between 7 and 9 bytes into a destination of size 7

This makes the code use a truncating snprintf() instead, which shuts
up that warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: pmcraid: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:09 +0000 (10:13 +0200)]
scsi: pmcraid: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mvsas: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:08 +0000 (10:13 +0200)]
scsi: mvsas: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: mpt3sas: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:07 +0000 (10:13 +0200)]
scsi: mpt3sas: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: megaraid: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:06 +0000 (10:13 +0200)]
scsi: megaraid: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: lpfc: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:05 +0000 (10:13 +0200)]
scsi: lpfc: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API. It also updates
some comments, accordingly.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: csiostor: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:04 +0000 (10:13 +0200)]
scsi: csiostor: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API. It also updates
the name of some variables and the content of comments, accordingly.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: be2iscsi: Replace PCI pool old API
Romain Perier [Thu, 6 Jul 2017 08:13:03 +0000 (10:13 +0200)]
scsi: be2iscsi: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: g_NCR5380: Two DTC436 PDMA workarounds
Ondrej Zary [Mon, 3 Jul 2017 07:59:06 +0000 (03:59 -0400)]
scsi: g_NCR5380: Two DTC436 PDMA workarounds

Limit PDMA send to 512 B to avoid data corruption on DTC3181E. The
corruption is always the same: one byte missing at the beginning of a
128 B block. It happens only with slow Quantum LPS 240 drive, not with
faster IBM DORS-32160. It's not clear what causes this. Documentation
for the DTC436 chip has not been made available.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: g_NCR5380: Re-work PDMA loops
Ondrej Zary [Mon, 3 Jul 2017 07:59:06 +0000 (03:59 -0400)]
scsi: g_NCR5380: Re-work PDMA loops

The polling loops in pread() and pwrite() can easily become infinite
loops and hang the machine.

Merge the IRQ check into host buffer wait loop and add polling limit.

Also place a limit on polling for 53C80 registers accessibility.

[Use NCR5380_poll_politely2() for register polling. Rely on polling for
gated IRQ rather than polling for phase error, like the algorithm in the
53c400 datasheet. Move DTC436 workarounds into a separate patch.
Factor-out common code as wait_for_53c80_access(). Rework the residual
calculations. -- F.T.]

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: g_NCR5380: Use unambiguous terminology for PDMA send and receive
Finn Thain [Mon, 3 Jul 2017 07:59:06 +0000 (03:59 -0400)]
scsi: g_NCR5380: Use unambiguous terminology for PDMA send and receive

The word "read" may be used to mean "DMA read operation" or "SCSI READ
command", though a READ command implies writing to memory.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: g_NCR5380: Cleanup comments and whitespace
Finn Thain [Mon, 3 Jul 2017 07:59:06 +0000 (03:59 -0400)]
scsi: g_NCR5380: Cleanup comments and whitespace

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: g_NCR5380: End PDMA transfer correctly on target disconnection
Ondrej Zary [Mon, 3 Jul 2017 07:59:05 +0000 (03:59 -0400)]
scsi: g_NCR5380: End PDMA transfer correctly on target disconnection

When an IRQ arrives during PDMA transfer, pread() and pwrite() return
without waiting for the 53C80 registers to be ready and this ends up
messing up the chip state. This was observed with SONY CDU-55S which is
slow enough to disconnect during 4096-byte reads.

IRQ during PDMA is not an error so don't return -1. Instead, store the
remaining byte count for use by NCR5380_dma_residual().

[Poll for the BASR_END_DMA_TRANSFER condition rather than remove the
error message -- F.T.]

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: g_NCR5380: Fix PDMA transfer size
Ondrej Zary [Mon, 3 Jul 2017 07:59:05 +0000 (03:59 -0400)]
scsi: g_NCR5380: Fix PDMA transfer size

generic_NCR5380_dma_xfer_len() incorrectly uses cmd->transfersize which
causes rescan-scsi-bus and CD-ROM access to hang the system.  Use
cmd->SCp.this_residual instead, like other NCR5380 drivers.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: complete all commands during bus reset
Hannes Reinecke [Fri, 30 Jun 2017 17:18:12 +0000 (19:18 +0200)]
scsi: aacraid: complete all commands during bus reset

When issuing a bus reset we should complete all commands, not
just the command triggering the reset.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: add fib flag to mark scsi command callback
Hannes Reinecke [Fri, 30 Jun 2017 17:18:11 +0000 (19:18 +0200)]
scsi: aacraid: add fib flag to mark scsi command callback

To correctly identify which fib has a scsi command callback this
patch implements a flag FIB_CONTEXT_FLAG_SCSI_CMD.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: enable sending of TMFs from aac_hba_send()
Hannes Reinecke [Fri, 30 Jun 2017 17:18:10 +0000 (19:18 +0200)]
scsi: aacraid: enable sending of TMFs from aac_hba_send()

aac_hba_send() will return FAILED for any non-SCSI command requests,
failing any TMFs. This patch updates the check to allow TMFs.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: use aac_tmf_callback for reset fib
Hannes Reinecke [Fri, 30 Jun 2017 17:18:09 +0000 (19:18 +0200)]
scsi: aacraid: use aac_tmf_callback for reset fib

When sending a reset fib we shouldn't rely on the scsi command,
but rather set the TMF status in the map_info->reset_state variable.
That allows us to send a TMF independent on a scsi command.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: split off device, target, and bus reset
Hannes Reinecke [Fri, 30 Jun 2017 17:18:08 +0000 (19:18 +0200)]
scsi: aacraid: split off device, target, and bus reset

Split off device, target, and bus reset functionality into
individual functions.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: split off host reset
Hannes Reinecke [Fri, 30 Jun 2017 17:18:07 +0000 (19:18 +0200)]
scsi: aacraid: split off host reset

Split off the host reset parts of aac_eh_reset() into a separate
host reset function.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoscsi: aacraid: split off functions to generate reset FIB
Hannes Reinecke [Fri, 30 Jun 2017 17:18:06 +0000 (19:18 +0200)]
scsi: aacraid: split off functions to generate reset FIB

Split off reset FIB generation into separate functions.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
6 years agoLinux 4.13-rc4
Linus Torvalds [Mon, 7 Aug 2017 01:44:49 +0000 (18:44 -0700)]
Linux 4.13-rc4

6 years agoMerge tag 'platform-drivers-x86-v4.13-4' of git://git.infradead.org/linux-platform...
Linus Torvalds [Sun, 6 Aug 2017 23:11:34 +0000 (16:11 -0700)]
Merge tag 'platform-drivers-x86-v4.13-4' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver fix from Darren Hart:
 "Fix loop preventing some platforms from waking up via the power button
  in s2idle:

   - intel-vbtn: match power button on press rather than release"

* tag 'platform-drivers-x86-v4.13-4' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: intel-vbtn: match power button on press rather than release

6 years agoMerge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Aug 2017 19:31:17 +0000 (12:31 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "A large number of ext4 bug fixes and cleanups for v4.13"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix copy paste error in ext4_swap_extents()
  ext4: fix overflow caused by missing cast in ext4_resize_fs()
  ext4, project: expand inode extra size if possible
  ext4: cleanup ext4_expand_extra_isize_ea()
  ext4: restructure ext4_expand_extra_isize
  ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
  ext4: make xattr inode reads faster
  ext4: inplace xattr block update fails to deduplicate blocks
  ext4: remove unused mode parameter
  ext4: fix warning about stack corruption
  ext4: fix dir_nlink behaviour
  ext4: silence array overflow warning
  ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
  ext4: release discard bio after sending discard commands
  ext4: convert swap_inode_data() over to use swap() on most of the fields
  ext4: error should be cleared if ea_inode isn't added to the cache
  ext4: Don't clear SGID when inheriting ACLs
  ext4: preserve i_mode if __ext4_set_acl() fails
  ext4: remove unused metadata accounting variables
  ext4: correct comment references to ext4_ext_direct_IO()

6 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sun, 6 Aug 2017 18:52:01 +0000 (11:52 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
 "This fixes two build issues for ralink platforms, both due to missing
  #includes which used to be included indirectly via other headers"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: ralink: mt7620: Add missing header
  MIPS: ralink: Fix build error due to missing header

6 years agoFix compat_sys_sigpending breakage
Dmitry V. Levin [Sat, 5 Aug 2017 20:00:50 +0000 (23:00 +0300)]
Fix compat_sys_sigpending breakage

The latest change of compat_sys_sigpending in commit 8f13621abced
("sigpending(): move compat to native") has broken it in two ways.

First, it tries to write 4 bytes more than userspace expects:
sizeof(old_sigset_t) == sizeof(long) == 8 instead of
sizeof(compat_old_sigset_t) == sizeof(u32) == 4.

Second, on big endian architectures these bytes are being written in the
wrong order.

This bug was found by strace test suite.

Reported-by: Anatoly Pugachev <matorola@gmail.com>
Inspired-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Fixes: 8f13621abced ("sigpending(): move compat to native")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoext4: fix copy paste error in ext4_swap_extents()
Maninder Singh [Sun, 6 Aug 2017 05:33:07 +0000 (01:33 -0400)]
ext4: fix copy paste error in ext4_swap_extents()

This bug was found by a static code checker tool for copy paste
problems.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
6 years agoext4: fix overflow caused by missing cast in ext4_resize_fs()
Jerry Lee [Sun, 6 Aug 2017 05:18:31 +0000 (01:18 -0400)]
ext4: fix overflow caused by missing cast in ext4_resize_fs()

On a 32-bit platform, the value of n_blcoks_count may be wrong during
the file system is resized to size larger than 2^32 blocks.  This may
caused the superblock being corrupted with zero blocks count.

Fixes: 1c6bd7173d66
Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # 3.7+
6 years agoext4, project: expand inode extra size if possible
Miao Xie [Sun, 6 Aug 2017 05:00:49 +0000 (01:00 -0400)]
ext4, project: expand inode extra size if possible

When upgrading from old format, try to set project id
to old file first time, it will return EOVERFLOW, but if
that file is dirtied(touch etc), changing project id will
be allowed, this might be confusing for users, we could
try to expand @i_extra_isize here too.

Reported-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
6 years agoext4: cleanup ext4_expand_extra_isize_ea()
Miao Xie [Sun, 6 Aug 2017 04:55:48 +0000 (00:55 -0400)]
ext4: cleanup ext4_expand_extra_isize_ea()

Clean up some goto statement, make ext4_expand_extra_isize_ea() clearer.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
6 years agoext4: restructure ext4_expand_extra_isize
Miao Xie [Sun, 6 Aug 2017 04:40:01 +0000 (00:40 -0400)]
ext4: restructure ext4_expand_extra_isize

Current ext4_expand_extra_isize just tries to expand extra isize, if
someone is holding xattr lock or some check fails, it will give up.
So rename its name to ext4_try_to_expand_extra_isize.

Besides that, we clean up unnecessary check and move some relative checks
into it.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
6 years agoext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
Miao Xie [Sun, 6 Aug 2017 04:27:38 +0000 (00:27 -0400)]
ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize

We should avoid the contention between the i_extra_isize update and
the inline data insertion, so move the xattr trylock in front of
i_extra_isize update.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>