Sreekanth Reddy [Tue, 30 Mar 2021 10:50:04 +0000 (16:20 +0530)]
scsi: mpt3sas: Only one vSES is present even when IOC has multi vSES
Whenever the driver is adding a vSES to virtual-phys list it is
reinitializing the list head. Hence those vSES devices which were added
previously are lost.
Stop reinitializing the list every time a new vSES device is added.
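The failure mode described above can be reproduced in a few lines of user-space C. This is only a sketch under stated assumptions: the list helpers are simplified stand-ins for the kernel's list API, and `struct vphy`, `add_vphy_buggy()`, and `add_vphy_fixed()` are hypothetical names, not the driver's own.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static size_t list_count(const struct list_head *head)
{
    const struct list_head *p;
    size_t n = 0;

    for (p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Hypothetical virtual-phy entry; the field names are illustrative. */
struct vphy {
    struct list_head node;
    int phy_num;
};

/* Buggy pattern: the head is reinitialized on every add, dropping old entries. */
static void add_vphy_buggy(struct list_head *head, struct vphy *v)
{
    init_list_head(head);
    list_add_tail(&v->node, head);
}

/* Fixed pattern: the head is initialized once elsewhere; adds only append. */
static void add_vphy_fixed(struct list_head *head, struct vphy *v)
{
    list_add_tail(&v->node, head);
}
```

With two entries added through `add_vphy_buggy()` only the last one remains reachable from the head, while `add_vphy_fixed()` keeps both, provided the head was initialized once beforehand.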
Create the device for the virtual LUN 0 using the DUMMY flag. This change
makes it possible to remove some special-casing in the INQUIRY code.
Link: https://lore.kernel.org/r/20210322200938.53300-3-k.shelekhin@yadro.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This commit adds the DUMMY flag to the rd_mcp backend that forces a logical
unit to report itself as a not-connected device of an unknown type.
Essentially this allows users to create devices identical to the device for
the virtual LUN 0, making it possible to explicitly create a LUN 0 device
and configure its WWNs (e.g. vendor or product name).
Link: https://lore.kernel.org/r/20210322200938.53300-2-k.shelekhin@yadro.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Pittman [Wed, 31 Mar 2021 18:16:56 +0000 (14:16 -0400)]
scsi: scsi_dh_alua: Prevent duplicate pg info print in alua_rtpg()
Because alua_rtpg() is called so frequently, the path group info print
within can log the same info multiple times, with the subsequent prints
adding no new information or value.
To fix, check stored values, only printing at alua attach/activate and if
any of the values change.
Link: https://lore.kernel.org/r/20210331181656.5046-1-jpittman@redhat.com Reviewed-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: John Pittman <jpittman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
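The "print only on change" idea above can be sketched as a small cache of the last logged values. This is an illustration, not the driver's code: `struct pg_snapshot`, `struct pg_log_cache`, and `pg_should_log()` are hypothetical names, and the `force` argument stands in for the attach/activate case.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values a path-group message would report. */
struct pg_snapshot {
    int state;
    int pref;
    int valid_states;
};

/* Hypothetical per-port-group record of what was last logged. */
struct pg_log_cache {
    struct pg_snapshot last;
    bool logged_once;
};

/*
 * Return true when the message should be printed: on the first call,
 * when force is set (standing in for attach/activate), or when any of
 * the cached values differs from the current ones.
 */
static bool pg_should_log(struct pg_log_cache *cache,
                          const struct pg_snapshot *cur, bool force)
{
    bool changed = !cache->logged_once ||
                   cache->last.state != cur->state ||
                   cache->last.pref != cur->pref ||
                   cache->last.valid_states != cur->valid_states;

    /* Remember the current values for the next comparison. */
    cache->last = *cur;
    cache->logged_once = true;
    return force || changed;
}
```

A caller would wrap its existing print in `if (pg_should_log(...))`, so repeated calls with identical values stay silent.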
Don Brace [Thu, 11 Mar 2021 20:17:54 +0000 (14:17 -0600)]
scsi: smartpqi: Update version to 2.1.8-045
Update version.
Link: https://lore.kernel.org/r/161549387469.25025.12859568843576080076.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Gerry Morong <gerry.morong@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:17:48 +0000 (14:17 -0600)]
scsi: smartpqi: Add new PCI IDs
Add support for newer hardware.
Link: https://lore.kernel.org/r/161549386882.25025.2594251735886014958.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Acked-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:17:42 +0000 (14:17 -0600)]
scsi: smartpqi: Correct system hangs when resuming from hibernation
Correct system hangs when resuming from hibernation after first successful
hibernation/resume cycle. Rare condition involving OFA.
Note: Suspend/resume is not supported on many platforms. It was originally
intended for workstations.
Link: https://lore.kernel.org/r/161549386295.25025.14555840632114761610.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Murthy Bhat [Thu, 11 Mar 2021 20:17:37 +0000 (14:17 -0600)]
scsi: smartpqi: Update enclosure identifier in sysfs
Update enclosure identifier field corresponding to physical devices in
lsscsi/sysfs.
During device add the SCSI devtype is filled in during slave_configure().
However, when pqi_scsi_update_device() runs (REGNEWD) the firmware returns
zero for the SCSI devtype field, and valid devtype is overwritten by
zero. Due to this, lsscsi output shows wrong values.
Link: https://lore.kernel.org/r/161549385708.25025.17234953506918043750.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:17:31 +0000 (14:17 -0600)]
scsi: smartpqi: Add additional logging for LUN resets
LUN resets can take longer to complete. Adding in more driver logging helps
show where the driver is in the reset process.
Add a timeout in pqi_device_wait_for_pending_io() to cap how long the
driver will wait for outstanding commands.
Link: https://lore.kernel.org/r/161549385119.25025.10366493975709358647.stgit@brunhilda Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
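A bounded wait of the kind described above might look like the following user-space sketch. Everything here is an assumption made for illustration: `struct fake_device`, its counter, the `tick` callback that stands in for sleeping, and the choice of `-ETIMEDOUT` as the error value.

```c
#include <errno.h>

/* Hypothetical device with a simple outstanding-command counter. */
struct fake_device {
    int cmds_outstanding;
};

/* Simulation helper: one command completes per polling interval. */
static void complete_one(struct fake_device *dev)
{
    if (dev->cmds_outstanding > 0)
        dev->cmds_outstanding--;
}

/* Simulation helper: the device is stuck and nothing completes. */
static void no_progress(struct fake_device *dev)
{
    (void)dev;
}

/*
 * Poll the counter until it drains or the time budget is used up.
 * tick() stands in for sleeping one polling interval.
 * Returns 0 when all commands completed, -ETIMEDOUT otherwise.
 */
static int wait_for_pending_io(struct fake_device *dev,
                               unsigned int timeout_ms,
                               unsigned int interval_ms,
                               void (*tick)(struct fake_device *))
{
    unsigned int waited_ms = 0;

    if (interval_ms == 0)
        interval_ms = 1; /* avoid a loop that never advances */

    while (dev->cmds_outstanding > 0) {
        if (waited_ms >= timeout_ms)
            return -ETIMEDOUT;
        tick(dev);
        waited_ms += interval_ms;
    }
    return 0;
}
```

Passing `complete_one` models a device that drains normally; passing `no_progress` models a hung device and shows the wait giving up instead of blocking forever.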
Murthy Bhat [Thu, 11 Mar 2021 20:17:25 +0000 (14:17 -0600)]
scsi: smartpqi: Update SAS initiator_port_protocols and target_port_protocols
Export valid sas initiator_port_protocols and target_port_protocols to
sysfs. Needed for lsscsi to show correct values.
Link: https://lore.kernel.org/r/161549384532.25025.1469409935400845385.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Murthy Bhat [Thu, 11 Mar 2021 20:17:19 +0000 (14:17 -0600)]
scsi: smartpqi: Add phy ID support for the physical drives
Display topology using PHY numbers. PHY (both local and remote) numbers
corresponding to physical drives are read from
BMIC_IDENTIFY_PHYSICAL_DEVICE.
Link: https://lore.kernel.org/r/161549383947.25025.16977895345376485056.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:17:13 +0000 (14:17 -0600)]
scsi: smartpqi: Convert snprintf() to scnprintf()
The entire Linux kernel has been slowly migrating from snprintf() to
scnprintf(), so we are doing our part. This article explains the rationale
for this change:
https://lwn.net/Articles/69419/
Link: https://lore.kernel.org/r/161549383357.25025.12363435617789964291.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
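The difference between the two functions is the return value on truncation: snprintf() returns the length the output would have had, while scnprintf() returns the number of characters actually written. The sketch below shows this in user space; `my_scnprintf()` is a hypothetical stand-in built on vsnprintf(), not the kernel function itself.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * User-space stand-in for scnprintf(): returns the number of characters
 * actually stored in buf, not counting the trailing NUL.
 */
static int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;               /* formatting error: nothing usable written */
    if ((size_t)n >= size)
        return (int)(size - 1); /* output was truncated */
    return n;
}
```

Code that advances a buffer offset by the return value (`len += ...`) stays within bounds with the scnprintf() semantics, whereas the snprintf() return value can push the offset past the end of the buffer after a truncation.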
Kevin Barnett [Thu, 11 Mar 2021 20:17:07 +0000 (14:17 -0600)]
scsi: smartpqi: Fix driver synchronization issues
- Synchronize OFA and controller offline events. Prevent I/O during the
above conditions.
- Clean up pqi_device_wait_for_pending_io() by checking the
device->scsi_cmds_outstanding instead of walking the device's list of
commands.
- Stop failing all I/O for all devices. This was causing OS to retry them,
delaying OFA.
- Clean up cache flush. The controller is checked for offline status in
lower level functions.
Link: https://lore.kernel.org/r/161549382770.25025.789855864026860170.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:17:01 +0000 (14:17 -0600)]
scsi: smartpqi: Update device scan operations
Change the return code from EINPROGRESS to EBUSY to signal applications to
retry a REGNEWD if the driver cannot process the REGNEWD. Events such as
OFA, suspend, and shutdown return EINPROGRESS if a scan is currently
running. This prevents applications from immediately retrying REGNEWD.
Schedule a new REGNEWD if the system is low on memory.
Link: https://lore.kernel.org/r/161549382157.25025.16054784597622125373.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:55 +0000 (14:16 -0600)]
scsi: smartpqi: Update OFA management
OFA, Online Firmware Activation, allows users to update firmware without a
reboot.
- Change OFA setup to a worker thread
- Delay soft resets
- Add OFA event handler to allow FW to initiate OFA
- Add in-memory allocation to OFA events
- Update OFA buffer size calculations
- Add ability to cancel OFA events
- Update OFA quiesce/un-quiesce
- Prevent kernel crashes while issuing ioctl during OFA
- Return EBUSY for pass-through IOCTLs throughout all stages of OFA
- Add mutex to prevent parallel OFA updates.
Link: https://lore.kernel.org/r/161549381563.25025.2647205502550052197.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:49 +0000 (14:16 -0600)]
scsi: smartpqi: Update RAID bypass handling
Simplify AIO retry management by removing the retry list and list
management. The need to retry is already set in the response status. Also
remove the bypass worker thread.
Accelerated I/O requests bypass the RAID engine and go directly to either
an HBA disk or to a physical component of a RAID volume.
Link: https://lore.kernel.org/r/161549380976.25025.11776487034357231156.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:44 +0000 (14:16 -0600)]
scsi: smartpqi: Update suspend/resume and shutdown
For suspend/resume and shutdown, prevent controller events, any new I/O
requests, controller requests, REGNEWD, and reset operations.
Wait for any pending completions from the controller to complete to avoid
controller NMI events.
Link: https://lore.kernel.org/r/161549380398.25025.12266769502766103580.stgit@brunhilda Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:38 +0000 (14:16 -0600)]
scsi: smartpqi: Synchronize device resets with mutex
Remove some flags used to check for device resets already in
progress. Allow only 1 reset operation at a time for the host.
Link: https://lore.kernel.org/r/161549379810.25025.10194117431886743795.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:32 +0000 (14:16 -0600)]
scsi: smartpqi: Update soft reset management for OFA
Clean up soft reset code for Online Firmware Activation (OFA). OFA allows
controller firmware updates without a reboot.
OFA updates require an on-line controller reset to activate the updated
firmware. There were some missing actions for some of the reset cases. The
controller is first set back to sis mode before returning to pqi mode.
Check to ensure the controller is in sis mode.
Release QRM memory (OFA buffer) on OFA error conditions. Clean up
controller state which can cause a kernel panic upon reboot after an
unsuccessful OFA.
Link: https://lore.kernel.org/r/161549379215.25025.10654441314249183621.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:26 +0000 (14:16 -0600)]
scsi: smartpqi: Update event handler
Change the data types for event_id and additional_event_id.
Link: https://lore.kernel.org/r/161549378628.25025.14338046567871170916.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:20 +0000 (14:16 -0600)]
scsi: smartpqi: Add support for wwid
WWID has been added to Report Physical LUNs in newer controller
firmware. The presence of this field is detected by a feature bit. Add
detection of this new feature and store the WWID when set.
Link: https://lore.kernel.org/r/161549378041.25025.3869709982357729841.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:14 +0000 (14:16 -0600)]
scsi: smartpqi: Remove timeouts from internal cmds
Remove timeouts for driver-initiated commands. Responses to internal
requests can take longer than hard coded timeout values and the driver will
still have an outstanding request that may complete in the future with no
context.
Link: https://lore.kernel.org/r/161549377451.25025.12306492868851801623.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:16:08 +0000 (14:16 -0600)]
scsi: smartpqi: Disable WRITE SAME for HBA NVMe disks
The controller does not support SCSI WRITE SAME for NVMe drives in HBA mode.
Link: https://lore.kernel.org/r/161549376866.25025.5961694654342018260.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Current stream detection:
cat /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
1
Turn off stream detection:
echo 0 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
Turn on stream detection:
echo 1 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
Link: https://lore.kernel.org/r/161549376281.25025.1132304698441513738.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Thu, 11 Mar 2021 20:15:56 +0000 (14:15 -0600)]
scsi: smartpqi: Add stream detection
Enhance performance by adding sequential stream detection for RAID5/RAID6
sequential write requests. Reduce stripe lock contention with full-stripe
write operations.
There is one common stripe lock for each RAID volume that can be set by
either the RAID engine or the AIO engine. The AIO path has I/O request
sizes well below the stripe size resulting in many Read-Modify-Write
operations.
Sending the request to the RAID engine allows for coalescing requests into
full stripe operations resulting in reduced Read-Modify-Write operations.
Link: https://lore.kernel.org/r/161549375693.25025.2962141451773219796.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
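Sequential stream detection of the kind described above can be sketched with a small table of "next expected LBA" slots. This is a simplified illustration, not the driver's algorithm: the slot count, the exact-match rule, the least-recently-used replacement, and all names (`struct stream_table`, `is_sequential_write()`) are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_STREAMS 4 /* assumed number of tracked streams per volume */

struct stream {
    uint64_t next_lba;      /* LBA expected if the stream continues */
    uint32_t last_accessed; /* logical timestamp of the last hit or fill */
    bool in_use;
};

struct stream_table {
    struct stream s[NUM_STREAMS];
    uint32_t clock;         /* incremented once per write request */
};

/*
 * Return true if this write continues a tracked stream. Otherwise record
 * it as a new stream in the least recently used slot and return false.
 */
static bool is_sequential_write(struct stream_table *t, uint64_t lba,
                                uint32_t nblocks)
{
    int i, victim = 0;

    t->clock++;
    for (i = 0; i < NUM_STREAMS; i++) {
        struct stream *st = &t->s[i];

        if (st->in_use && st->next_lba == lba) {
            st->next_lba = lba + nblocks;
            st->last_accessed = t->clock;
            return true;
        }
        /* Unused slots have timestamp 0, so they are chosen first. */
        if (st->last_accessed < t->s[victim].last_accessed)
            victim = i;
    }

    t->s[victim].next_lba = lba + nblocks;
    t->s[victim].last_accessed = t->clock;
    t->s[victim].in_use = true;
    return false;
}
```

In this sketch a caller would route a write to the RAID path when the function returns true and to the AIO path otherwise; a zero-initialized table is a valid starting state.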
Kevin Barnett [Thu, 11 Mar 2021 20:15:50 +0000 (14:15 -0600)]
scsi: smartpqi: Align code with oob driver
Reduce differences between out-of-box driver and kernel.org driver. No
functional changes.
Link: https://lore.kernel.org/r/161549375094.25025.9268879575316758510.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:15:45 +0000 (14:15 -0600)]
scsi: smartpqi: Add support for long firmware version
Add support for new "long" firmware version which requires minor driver
changes to expose.
Link: https://lore.kernel.org/r/161549374508.25025.15467221395888158022.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:15:39 +0000 (14:15 -0600)]
scsi: smartpqi: Add support for BMIC sense feature cmd and feature bits
Determine supported features from the BMIC sense feature command
instead of the config table. Enable features such as: RAID 1/5/6 write
support, SATA wwid, and encryption.
Link: https://lore.kernel.org/r/161549373914.25025.7999816178098103135.stgit@brunhilda Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Thu, 11 Mar 2021 20:15:33 +0000 (14:15 -0600)]
scsi: smartpqi: Add support for RAID1 writes
Add RAID1 write IU and implement RAID1 write support. Change brand names
ADM/ADG to TRIPLE/RAID-6.
Link: https://lore.kernel.org/r/161549373324.25025.2441592111049564780.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Thu, 11 Mar 2021 20:15:27 +0000 (14:15 -0600)]
scsi: smartpqi: Add support for RAID5 and RAID6 writes
Add in new IU definition and implement support for RAID5 and RAID6 writes.
Link: https://lore.kernel.org/r/161549372734.25025.963261942897080281.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Thu, 11 Mar 2021 20:15:21 +0000 (14:15 -0600)]
scsi: smartpqi: Refactor scatterlist code
Factor out code common to all scatter-gather list building to prepare for
new AIO functionality. AIO (Accelerated I/O) requests go directly to disk.
No functional changes.
Link: https://lore.kernel.org/r/161549372147.25025.9706613054649682229.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Thu, 11 Mar 2021 20:15:15 +0000 (14:15 -0600)]
scsi: smartpqi: Refactor aio submission code
Refactor aio submission code:
1. Break up function pqi_raid_bypass_submit_scsi_cmd()
into smaller functions.
2. Add common block (rmd - raid_map_data) to carry around into newly
added functions.
3. Prepare for new AIO functionality.
No functional changes.
Link: https://lore.kernel.org/r/161549371553.25025.8840958689316611074.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kevin Barnett [Thu, 11 Mar 2021 20:15:09 +0000 (14:15 -0600)]
scsi: smartpqi: Add support for new product ids
Add support for newer hardware by adding in a product identifier. This
identifier can then be used to check for the hardware generation.
Link: https://lore.kernel.org/r/161549370966.25025.2968242206975557607.stgit@brunhilda Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Murthy Bhat [Thu, 11 Mar 2021 20:15:03 +0000 (14:15 -0600)]
scsi: smartpqi: Correct request leakage during reset operations
While failing queued I/Os in TMF path, there was a request leak and hence
stale entries in request pool with ref count being non-zero. In shutdown
path we have a BUG_ON to catch stuck I/O either in firmware or in the
driver. The stale requests caused a system crash. The I/O request pool
leakage also led to a significant performance drop.
Link: https://lore.kernel.org/r/161549370379.25025.12793264112620796062.stgit@brunhilda Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Thu, 11 Mar 2021 20:14:57 +0000 (14:14 -0600)]
scsi: smartpqi: Use host-wide tag space
Enable host-wide tags to stop the SCSI midlayer from sending more requests
than the exposed host queue depth, which caused firmware ASSERT and lockup
issues.
Note: This also results in better performance.
Link: https://lore.kernel.org/r/161549369787.25025.8975999483518581619.stgit@brunhilda Suggested-by: Ming Lei <ming.lei@redhat.com> Suggested-by: John Garry <john.garry@huawei.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: message: mptlan: Replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code
should always use "flexible array members"[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].
Refactor the code according to the use of a flexible-array member in struct
_SGE_TRANSACTION32 instead of one-element array.
Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warning:
CC [M] drivers/message/fusion/mptlan.o
drivers/message/fusion/mptlan.c: In function ‘mpt_lan_sdu_send’:
drivers/message/fusion/mptlan.c:759:28: warning: array subscript 1 is above array bounds of ‘U32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
759 | pTrans->TransactionDetails[1] = cpu_to_le32((mac[2] << 24) |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
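The difference between the two declaration styles can be shown with a pair of small structures. The sketch below uses hypothetical types (`struct txn_one_elem`, `struct txn_flex`) and a hypothetical allocator `txn_alloc()`; it is not the layout of `_SGE_TRANSACTION32`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Older style: a one-element array used as a variable-length tail. */
struct txn_one_elem {
    uint8_t hdr[4];
    uint32_t details[1]; /* indexing details[1] looks out of bounds */
};

/* Preferred style: a C99 flexible-array member. */
struct txn_flex {
    uint8_t hdr[4];
    uint32_t details[];  /* size is decided at allocation time */
};

/* Allocate a zeroed structure with room for count trailing elements. */
static struct txn_flex *txn_alloc(size_t count)
{
    struct txn_flex *t;

    /* Reject sizes that would overflow the allocation arithmetic. */
    if (count > (SIZE_MAX - sizeof(*t)) / sizeof(t->details[0]))
        return NULL;

    t = calloc(1, sizeof(*t) + count * sizeof(t->details[0]));
    return t;
}
```

Note that `sizeof(struct txn_flex)` no longer includes one trailing element, so any size calculation that relied on the old one-element layout has to be adjusted when a structure is converted.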
scsi: message: fusion: Replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code
should always use "flexible array members"[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].
Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warning:
drivers/message/fusion/mptbase.c: In function ‘mptbase_reply’:
drivers/message/fusion/mptbase.c:7747:62: warning: array subscript 1 is above array bounds of ‘U32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
7747 | ioc->events[idx].data[ii] = le32_to_cpu(pEventReply->Data[ii]);
./include/uapi/linux/byteorder/little_endian.h:34:51: note: in definition of macro ‘__le32_to_cpu’
34 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
| ^
drivers/message/fusion/mptbase.c:7747:33: note: in expansion of macro ‘le32_to_cpu’
7747 | ioc->events[idx].data[ii] = le32_to_cpu(pEventReply->Data[ii]);
|
An old cleanup changed the array size from MAX_ADDR_LEN to unspecified in
the declaration, but now gcc-11 warns about this:
drivers/scsi/fcoe/fcoe_ctlr.c:1972:37: error: argument 1 of type ‘unsigned char[32]’ with mismatched bound [-Werror=array-parameter=]
1972 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN],
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /git/arm-soc/drivers/scsi/fcoe/fcoe_ctlr.c:33:
include/scsi/libfcoe.h:252:37: note: previously declared as ‘unsigned char[]’
252 | u64 fcoe_wwn_from_mac(unsigned char mac[], unsigned int, unsigned int);
| ~~~~~~~~~~~~~~^~~~~
Change the type back to what the function definition uses.
Link: https://lore.kernel.org/r/20210322164702.957810-1-arnd@kernel.org Fixes: fdd78027fd47 ("[SCSI] fcoe: cleans up libfcoe.h and adds fcoe.h for fcoe module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In this case, the code is entirely valid, as the string is properly
terminated, and the size argument is only there out of extra caution in
case it exceeds a page.
This cannot really happen here, so just simplify it to a sizeof().
Link: https://lore.kernel.org/r/20210322160253.4032422-10-arnd@kernel.org Fixes: afff0d2321ea ("scsi: lpfc: Add Buffer overflow check, when nvme_info larger than PAGE_SIZE") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Mon, 22 Mar 2021 10:33:09 +0000 (11:33 +0100)]
scsi: mvsas: Avoid -Wempty-body warning
Building with 'make W=1' shows a few harmless -Wempty-body warnings for the
mvsas driver:
drivers/scsi/mvsas/mv_94xx.c: In function 'mvs_94xx_phy_reset':
drivers/scsi/mvsas/mv_94xx.c:278:63: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
278 | mv_dprintk("phy hard reset failed.\n");
| ^
drivers/scsi/mvsas/mv_sas.c: In function 'mvs_task_prep':
drivers/scsi/mvsas/mv_sas.c:723:57: error: suggest braces around empty body in an 'else' statement [-Werror=empty-body]
723 | SAS_ADDR(dev->sas_addr));
| ^
Change the empty dprintk() macros to no_printk(), which avoids this warning
and adds format string checking.
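The idea behind the change can be sketched in user space. The macro below is a portable stand-in written for this example: it is named `no_printk` after the kernel helper but is defined here in terms of printf(), and the kernel's own definition differs in detail. `my_dprintk()` and `reset_phy()` are hypothetical.

```c
#include <stdio.h>

/*
 * Stand-in for a "compiled out" print: the call is never executed, but
 * the compiler still sees the format string and arguments, so format
 * checking applies and the body of an if/else is never empty.
 */
#define no_printk(...) do { if (0) printf(__VA_ARGS__); } while (0)

#ifdef MY_DEBUG
#define my_dprintk(...) printf(__VA_ARGS__)
#else
#define my_dprintk(...) no_printk(__VA_ARGS__)
#endif

/* Hypothetical caller: the debug print is the entire body of the 'if'. */
static int reset_phy(int hard_reset_failed)
{
    int handled = 0;

    if (hard_reset_failed)
        my_dprintk("phy hard reset failed: %d\n", hard_reset_failed);
    else
        handled = 1;

    return handled;
}
```

With an empty macro the `if` above would be left with an empty body, which is what -Wempty-body reports; with the stand-in, the statement is always present and a mismatched format argument is still diagnosed.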
There are a couple of warnings in this driver when building with W=1:
drivers/message/fusion/mptbase.c: In function 'PrimeIocFifos':
drivers/message/fusion/mptbase.c:4608:65: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
4608 | "restoring 64 bit addressing\n", ioc->name));
| ^
drivers/message/fusion/mptbase.c:4633:65: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
4633 | "restoring 64 bit addressing\n", ioc->name));
The macros are slightly suboptimal since they are not proper statements.
Change both versions to the usual "do { ... } while (0)" style to
make them more robust and avoid the warning.
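Why a macro that is not a proper statement is fragile can be shown with a small example. The macro shapes below are assumptions chosen to illustrate the general hazard, not the driver's actual definitions; `DBG_BAD`, `DBG_GOOD`, and the counters are hypothetical.

```c
static int logged;   /* number of debug messages "printed" */
static int fallback; /* number of times the else branch ran */

static void log_msg(const char *msg)
{
    (void)msg;
    logged++;
}

/* Not a proper statement: exposes a bare 'if' to the caller. */
#define DBG_BAD(enabled, msg) if (enabled) log_msg(msg)

/* Proper statement: behaves like a single function call. */
#define DBG_GOOD(enabled, msg) do { if (enabled) log_msg(msg); } while (0)

static void run_bad(int cond, int debug)
{
    if (cond)
        DBG_BAD(debug, "cond was true");
    else
        fallback++; /* binds to the 'if' hidden inside the macro */
}

static void run_good(int cond, int debug)
{
    if (cond)
        DBG_GOOD(debug, "cond was true");
    else
        fallback++; /* binds to 'if (cond)' as intended */
}
```

Calling `run_bad(1, 0)` increments `fallback` even though `cond` is true, because the `else` attaches to the macro's inner `if`; `run_good(1, 0)` leaves it untouched.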
Arnd Bergmann [Mon, 22 Mar 2021 10:25:43 +0000 (11:25 +0100)]
scsi: aic94xx: Avoid -Wempty-body warning
Building with 'make W=1' shows a harmless -Wempty-body warning:
drivers/scsi/aic94xx/aic94xx_init.c: In function 'asd_free_queues':
drivers/scsi/aic94xx/aic94xx_init.c:858:62: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
858 | ASD_DPRINTK("Uh-oh! Pending is not empty!\n");
Change the empty ASD_DPRINTK() macro to no_printk(), which avoids this
warning and adds format string checking.
Colin Ian King [Sat, 27 Mar 2021 23:06:50 +0000 (23:06 +0000)]
scsi: qedi: Remove redundant assignment to variable err
Variable err is assigned -ENOMEM followed by an error return path via label
err_udev that does not access the variable and returns with the -ENOMEM
error return code. The assignment to err is redundant and can be removed.
Lee Duncan [Tue, 23 Mar 2021 17:27:56 +0000 (10:27 -0700)]
scsi: fnic: Remove bogus ratelimit messages
Commit b43abcbbd5b1 ("scsi: fnic: Ratelimit printks to avoid flooding when
vlan is not set by the switch.i") added printk_ratelimit() in front of a
couple of debug-mode messages to reduce logging overrun when debugging the
driver. The code:
> if (printk_ratelimit())
> FNIC_FCS_DBG(KERN_DEBUG, fnic->lport->host,
> "Start VLAN Discovery\n");
ends up calling printk_ratelimit() quite often, triggering many kernel
messages about callbacks being suppressed.
The fix is to decompose FNIC_FCS_DBG(), then change the order of checks so
that printk_ratelimit() is only called if driver debugging is enabled.
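The effect of the reordering can be modelled in a few lines. This is a sketch
under assumptions, not the fnic change itself: debug_level, ratelimit_ok() and
the should_log_*() helpers are hypothetical stand-ins for the driver's log
level and for printk_ratelimit().

```c
#include <stdbool.h>

static int debug_level;		/* hypothetical stand-in for the driver log level */
static int ratelimit_calls;	/* counts how often the limiter is consulted */

/* Stand-in for printk_ratelimit(); always allows the message here. */
static bool ratelimit_ok(void)
{
	ratelimit_calls++;
	return true;
}

/* Old order: the limiter is consulted even when debugging is off. */
static bool should_log_old(void)
{
	return ratelimit_ok() && debug_level > 0;
}

/* New order: && short-circuits, so the limiter is only reached with debug on. */
static bool should_log_new(void)
{
	return debug_level > 0 && ratelimit_ok();
}
```

With debugging disabled, should_log_new() never touches the rate limiter, so
the limiter has nothing to complain about.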
Quinn Tran [Mon, 29 Mar 2021 08:52:26 +0000 (01:52 -0700)]
scsi: qla2xxx: Fix mailbox recovery during PCIe error
For the mailbox thread that encounters a PCIe error, pause that thread
until PCIe link reset/recovery has completed to prevent the thread from
possibly unmapping any type of DMA resource that might be in progress.
Quinn Tran [Mon, 29 Mar 2021 08:52:22 +0000 (01:52 -0700)]
scsi: qla2xxx: Fix use after free in bsg
On bsg command completion, bsg_job_done() was called while the qla driver
continued to access the bsg_job buffer. bsg_job_done() would free up
resources that ended up being reused by another task while the driver
continued to access the buffers. As a result, the driver was reading garbage
data.
Quinn Tran [Mon, 29 Mar 2021 08:52:20 +0000 (01:52 -0700)]
scsi: qla2xxx: Fix stuck session
Session was stuck due to explicit logout to target timing out. The target
was in an unresponsive state. This timeout induced an error that prevented
the GNL command from moving forward.
Link: https://lore.kernel.org/r/20210329085229.4367-4-njavali@marvell.com Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arun Easi [Mon, 29 Mar 2021 08:52:19 +0000 (01:52 -0700)]
scsi: qla2xxx: Add H:C:T info in the log message for fc ports
The host:channel:scsi_target_id information is helpful in matching an FC
port with a SCSI device, so add it. For initiator FC ports, a -1 would be
displayed for the "target" part.
Link: https://lore.kernel.org/r/20210329085229.4367-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Gulam Mohamed [Thu, 25 Mar 2021 09:32:48 +0000 (09:32 +0000)]
scsi: iscsi: Fix race condition between login and sync thread
A kernel panic was observed due to a timing issue between the sync thread
and the initiator processing a login response from the target. The session
reopen can be invoked both from the session sync thread when iscsid
restarts and from iscsid through the error handler. Before the initiator
receives the response to a login, another reopen request can be sent from
the error handler/sync session. When the initial login response is
subsequently processed, the connection has been closed and the socket has
been released.
To fix this a new connection state, ISCSI_CONN_BOUND, is added:
- Set the connection state value to ISCSI_CONN_DOWN upon
iscsi_if_ep_disconnect() and iscsi_if_stop_conn()
- Set the connection state to the newly created value ISCSI_CONN_BOUND
after bind connection (transport->bind_conn())
- In iscsi_set_param(), return -ENOTCONN if the connection state is neither
ISCSI_CONN_BOUND nor ISCSI_CONN_UP
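The state check from the last item can be sketched as follows. This is a
minimal userspace model, not the transport class code: the state names are
taken from the text above, their numeric values are arbitrary, and
set_param_check() is a hypothetical helper.

```c
#include <errno.h>

/* State names follow the commit text; the numeric values are arbitrary. */
enum conn_state {
	ISCSI_CONN_UP,
	ISCSI_CONN_DOWN,
	ISCSI_CONN_BOUND,
};

/* Hypothetical helper modelling the check described for iscsi_set_param(). */
static int set_param_check(enum conn_state state)
{
	if (state != ISCSI_CONN_BOUND && state != ISCSI_CONN_UP)
		return -ENOTCONN;

	return 0;
}
```

A parameter update that arrives after the connection was moved to
ISCSI_CONN_DOWN is rejected instead of touching a released socket.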
Martin Wilck [Tue, 23 Mar 2021 21:24:31 +0000 (22:24 +0100)]
scsi: target: pscsi: Clean up after failure in pscsi_map_sg()
If pscsi_map_sg() fails, make sure to drop references to already allocated
bios.
Link: https://lore.kernel.org/r/20210323212431.15306-2-mwilck@suse.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Martin Wilck [Tue, 23 Mar 2021 21:24:30 +0000 (22:24 +0100)]
scsi: target: pscsi: Avoid OOM in pscsi_map_sg()
pscsi_map_sg() uses the variable nr_pages as a hint for bio_kmalloc() how
many vector elements to allocate. If nr_pages is < BIO_MAX_PAGES, it will
be reset to 0 after successful allocation of the bio.
If bio_add_pc_page() fails later for whatever reason, pscsi_map_sg() tries
to allocate another bio, passing nr_vecs = 0. This causes bio_add_pc_page()
to fail immediately in the next call. pscsi_map_sg() continues to allocate
zero-length bios until memory is exhausted and the kernel crashes with
OOM. This can be easily observed by exporting a SATA DVD drive via pscsi.
The target crashes as soon as the client tries to access the DVD LUN. In
the case I analyzed, bio_add_pc_page() would fail because the DVD device's
max_sectors_kb (128) was exceeded.
Avoid this by simply not resetting nr_pages to 0 after allocating the
bio. This way, the client receives an I/O error when it tries to send
requests exceeding the device's max_sectors_kb, and eventually gets it
right. The client must still limit max_sectors_kb e.g. by a udev rule if
(like in my case) the driver doesn't report valid block limits, otherwise
it encounters I/O errors.
Link: https://lore.kernel.org/r/20210323212431.15306-1-mwilck@suse.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Tue, 23 Mar 2021 12:54:23 +0000 (13:54 +0100)]
scsi: pm8001: Avoid -Wrestrict warning
On some configurations, gcc warns about overlapping source and destination
arguments to snprintf:
drivers/scsi/pm8001/pm8001_init.c: In function 'pm8001_request_msix':
drivers/scsi/pm8001/pm8001_init.c:977:3: error: 'snprintf' argument 4 may overlap destination object 'pm8001_ha' [-Werror=restrict]
977 | snprintf(drvname, len, "%s-%d", pm8001_ha->name, i);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/pm8001/pm8001_init.c:962:56: note: destination object referenced by 'restrict'-qualified argument 1 was declared here
962 | static u32 pm8001_request_msix(struct pm8001_hba_info *pm8001_ha)
| ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
I first assumed this was a gcc bug, as that should not happen, but a
reduced test case makes it clear that this happens when the loop counter is
not bounded by the array size.
Help the compiler out by adding an explicit limit here to make the code
slightly more robust and avoid the warning.
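A reduced sketch of the kind of bound meant here (not the actual pm8001
change; struct hba, MAX_VEC and name_vectors() are hypothetical, and only the
"%s-%d" format is taken from the warning above). Clamping the loop count to
the array size guarantees that the destination index stays inside the name
array:

```c
#include <stdio.h>

#define MAX_VEC 4	/* hypothetical size of the per-vector name array */

struct hba {
	char name[32];
	char intr_drvname[MAX_VEC][64];
	int number_of_intr;
};

/* Fill in one name per interrupt vector; returns how many were written. */
static int name_vectors(struct hba *ha)
{
	int i, n = ha->number_of_intr;

	/* Explicit limit: the index can never run past intr_drvname[]. */
	if (n > MAX_VEC)
		n = MAX_VEC;

	for (i = 0; i < n; i++)
		snprintf(ha->intr_drvname[i], sizeof(ha->intr_drvname[i]),
			 "%s-%d", ha->name, i);

	return n;
}
```

Even if number_of_intr is larger than the array, only MAX_VEC names are
written.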
Rasmus Villemoes [Wed, 10 Mar 2021 22:16:02 +0000 (23:16 +0100)]
scsi: bnx2i: Make bnx2i_process_iscsi_error() simpler and more robust
Instead of strcpy'ing into a stack buffer, just let additional_notice point
to a string literal living in .rodata. This is better in a few ways:
- Smaller .text - instead of gcc compiling the strcpys as a bunch of
immediate stores (effectively encoding the string literal in the
instruction stream), we only pay the price of storing the literal in
.rodata.
- Faster, because there's no string copying.
- Smaller stack usage (with my compiler, 72 bytes instead of 176 for the
sole caller, bnx2i_indicate_kcqe)
Moreover, it's currently possible for additional_notice[] to get used
uninitialized, so some random stack garbage would be passed to printk() -
in the worst case without any '\0' anywhere in those 64 bytes. That could
be fixed by initializing additional_notice[0], but the same is achieved
here by initializing the new pointer variable to "".
Also give the message pointer a similar treatment - there's no point making
temporary copies on the stack of those two strings.
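The pattern can be sketched like this. It is an illustration, not the bnx2i
code: the error codes, the message strings and error_notice() are made up;
only the additional_notice name and the "" initializer come from the
description above.

```c
/* Hypothetical error codes for illustration. */
enum { ERR_HDR_DIGEST = 1, ERR_DATA_DIGEST = 2 };

static const char *error_notice(int code)
{
	/* Initialised to "" so an unhandled code never yields stack garbage. */
	const char *additional_notice = "";

	switch (code) {
	case ERR_HDR_DIGEST:
		/* Just a pointer assignment; the literal lives in .rodata. */
		additional_notice = "hdr digest err";
		break;
	case ERR_DATA_DIGEST:
		additional_notice = "data digest err";
		break;
	}

	return additional_notice;
}
```

No buffer is reserved on the stack and no bytes are copied; an unknown code
simply results in an empty string.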
Jia-Ju Bai [Mon, 8 Mar 2021 03:30:24 +0000 (19:30 -0800)]
scsi: qedi: Fix error return code of qedi_alloc_global_queues()
When kzalloc() returns NULL to qedi->global_queues[i], no error return code
of qedi_alloc_global_queues() is assigned. To fix this bug, status is
assigned with -ENOMEM in this case.
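A userspace sketch of the corrected error path, under stated assumptions:
struct queue, NUM_QUEUES, alloc_global_queues() and the fail_at parameter are
hypothetical, and calloc() stands in for kzalloc().

```c
#include <errno.h>
#include <stdlib.h>

#define NUM_QUEUES 4	/* hypothetical queue count */

struct queue {
	int id;
};

/* fail_at simulates an allocation failure at that index (-1 for none). */
static int alloc_global_queues(struct queue **queues, int fail_at)
{
	int status = 0;
	int i;

	for (i = 0; i < NUM_QUEUES; i++) {
		queues[i] = (i == fail_at) ? NULL : calloc(1, sizeof(**queues));
		if (!queues[i]) {
			status = -ENOMEM;	/* the assignment that was missing */
			goto err_free;
		}
	}

	return 0;

err_free:
	/* Unwind the queues allocated before the failure. */
	while (--i >= 0) {
		free(queues[i]);
		queues[i] = NULL;
	}

	return status;
}
```

Without the status assignment, the error path would return 0 and the caller
would treat the partially allocated set as a success.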
Link: https://lore.kernel.org/r/20210308033024.27147-1-baijiaju1990@gmail.com Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Sat, 20 Mar 2021 23:23:58 +0000 (16:23 -0700)]
scsi: qla2xxx: Always check the return value of qla24xx_get_isp_stats()
This patch fixes the following Coverity warning:
CID 361199 (#1 of 1): Unchecked return value (CHECKED_RETURN)
3. check_return: Calling qla24xx_get_isp_stats without checking return
value (as is done elsewhere 4 out of 5 times).
Link: https://lore.kernel.org/r/20210320232359.941-7-bvanassche@acm.org Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Lee Duncan <lduncan@suse.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch fixes the following Coverity complaint:
CID 177490 (#1 of 1): Unused value (UNUSED_VALUE)
assigned_value: Assigning value from opcode & 0xffffff7fU to opcode
here, but that stored value is overwritten before it can be used.
Link: https://lore.kernel.org/r/20210320232359.941-6-bvanassche@acm.org Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Sat, 20 Mar 2021 23:23:56 +0000 (16:23 -0700)]
scsi: qla2xxx: Suppress Coverity complaints about dseg_r*
Change dseg_rq and dseg_rsp from scalar structure members into
single-element arrays such that Coverity does not complain about the
(*cur_dsd)++ statement in append_dsd64().
Link: https://lore.kernel.org/r/20210320232359.941-5-bvanassche@acm.org Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Sat, 20 Mar 2021 23:23:53 +0000 (16:23 -0700)]
scsi: Revert "qla2xxx: Make sure that aborted commands are freed"
Calling vha->hw->tgt.tgt_ops->free_cmd() from qlt_xmit_response() is wrong
since the command for which a response is sent must remain valid until the
SCSI target core calls .release_cmd(). It has been observed that the
following scenario triggers a kernel crash:
- transport_handle_queue_full() tries to retransmit the response
Fix this crash by reverting the patch that introduced it.
Link: https://lore.kernel.org/r/20210320232359.941-2-bvanassche@acm.org Fixes: 0dcec41acb85 ("scsi: qla2xxx: Make sure that aborted commands are freed") Cc: Quinn Tran <qutran@marvell.com> Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tyrel Datwyler [Fri, 19 Mar 2021 20:50:29 +0000 (14:50 -0600)]
scsi: ibmvfc: Make ibmvfc_wait_for_ops() MQ aware
During MQ enablement of the ibmvfc driver, ibmvfc_wait_for_ops() was
missed. This function is responsible for waiting on commands to complete
that match certain criteria such as LUN or cancel key. The implementation
as is only scans the CRQ for events, ignoring any sub-queues, and as a result
will exit successfully without doing anything when operating in MQ
channelized mode.
Check the MQ and channel use flags to determine which queues are
applicable, and scan each queue accordingly. Note in MQ mode SCSI commands
are only issued down sub-queues and the CRQ is only used for driver
specific management commands. As such the CRQ events are ignored when
operating in MQ mode with channels.
Link: https://lore.kernel.org/r/20210319205029.312969-3-tyreld@linux.ibm.com Fixes: 9000cb998bcf ("scsi: ibmvfc: Enable MQ and set reasonable defaults") Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tyrel Datwyler [Fri, 19 Mar 2021 20:50:28 +0000 (14:50 -0600)]
scsi: ibmvfc: Fix potential race in ibmvfc_wait_for_ops()
For various EH activities the ibmvfc driver uses ibmvfc_wait_for_ops() to
wait for the completion of commands that match given criteria, be it a
cancel key or a specific LUN. With recent changes commands are completed
outside the lock in bulk by removing them from the sent list and adding
them to a private completion list. This introduces a potential race in
ibmvfc_wait_for_ops() since the criterion for a command to be outstanding is
no longer simply being on the sent list, but instead not being on the free
list.
Avoid this race by scanning the entire command event pool and checking that
any matching command that ibmvfc needs to wait on is not already on the
free list.
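The scan can be modelled roughly as follows. This is a sketch, not the ibmvfc
implementation: struct event, its fields and count_outstanding() are
hypothetical, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

struct event {
	bool free;		/* true once the event is back on the free list */
	int cancel_key;		/* hypothetical match criterion */
};

/* Count events that match the key and are still outstanding (not free). */
static size_t count_outstanding(const struct event *pool, size_t n, int key)
{
	size_t i, waiting = 0;

	for (i = 0; i < n; i++) {
		if (pool[i].free)
			continue;	/* already completed and returned */
		if (pool[i].cancel_key == key)
			waiting++;
	}

	return waiting;
}
```

Walking the whole pool means an event that has left the sent list but is still
sitting on a private completion list is counted as outstanding.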
Link: https://lore.kernel.org/r/20210319205029.312969-2-tyreld@linux.ibm.com Fixes: 1f4a4a19508d ("scsi: ibmvfc: Complete commands outside the host/queue lock") Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>