]> git.proxmox.com Git - mirror_ubuntu-disco-kernel.git/commit
scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
authorXiang Chen <chenxiang66@hisilicon.com>
Thu, 28 Feb 2019 14:50:58 +0000 (22:50 +0800)
committerSeth Forshee <seth.forshee@canonical.com>
Mon, 1 Apr 2019 19:08:07 +0000 (14:08 -0500)
commit0a033de06edf4c618f498afa9986b7aa19e797db
tree4a1dda2912850195d8e21d4191e19f0172626bcc
parenta4cacdebd197767ff8a39c94f848eb62da1eb416
scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO

BugLink: https://bugs.launchpad.net/bugs/1822385
For internal IO and SMP IO, there is a time-out timer for them. In the
timer handler, it checks whether IO is done according to the flag
task->task_state_lock.

There is an issue which may cause system suspended: internal IO or SMP IO
is sent, but at that time because of hardware exception (such as inject
2Bit ECC error), so IO is not completed and also not timeout. But, at that
time, the SAS controller reset occurs to recover system. It will release
the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO
timeout, it will never complete the completion of IO and wait for ever.

[  729.123632] Call trace:
[  729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8
[  729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc
[  729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c
[  729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc
[  729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0
[  729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34
[  729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
[  729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
[  729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8
[  729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398
[  729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0
[  729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138
[  729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18

To solve the issue, callback function task_done of those IOs need to be
called when on SAS controller reset.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4790595723d4b833b18c994973d39f9efb842887)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
drivers/scsi/hisi_sas/hisi_sas_main.c