git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/commit

author	Sagi Grimberg <sagi@grimberg.me>
	Wed, 29 Jul 2020 09:36:03 +0000 (02:36 -0700)
committer	Stefan Bader <stefan.bader@canonical.com>
	Mon, 9 Nov 2020 13:46:43 +0000 (14:46 +0100)
commit	885764faa01aa5a6c688eefc6f48dff22dd7d0e3
tree	eeb08f6fba6b540c0762dc7a04ba0b2b6ca8774c
parent	41a1477535d46c58559565f7bee463ef05e985be

nvme-rdma: fix timeout handler

BugLink: https://bugs.launchpad.net/bugs/1896824
[ Upstream commit 0475a8dcbcee92a5d22e40c9c6353829fc6294b8 ]

When a request times out in a LIVE state, we simply trigger error
recovery and let error recovery handle the request cancellation.
However, when a request times out in a non-LIVE state, we make sure to
complete it immediately, as it might otherwise block controller setup
or teardown and prevent forward progress.

However, tearing down the entire set of I/O and admin queues causes a
freeze/unfreeze imbalance (q->mq_freeze_depth) and is really overkill
for what we actually need, which is to just fence any controller
teardown that may be running, stop the queue, and cancel the request if
it is not already completed.
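
The decision described above can be sketched as a small user-space
model. This is an illustration only, not the driver code: every type
and function name below (model_ctrl, model_queue, model_request,
model_timeout, and so on) is hypothetical, a pthread mutex stands in
for the controller teardown_lock, and plain flags stand in for
stopping the queue and completing the request.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
enum ctrl_state { CTRL_LIVE, CTRL_CONNECTING, CTRL_DELETING };
enum timeout_action { TIMEOUT_RESET_TIMER, TIMEOUT_DONE };

#define MODEL_STATUS_HOST_ABORTED 1

struct model_ctrl {
	enum ctrl_state state;
	pthread_mutex_t teardown_lock;	/* serializes teardown and cancellation */
	bool recovery_scheduled;
};

struct model_queue {
	struct model_ctrl *ctrl;
	bool live;
};

struct model_request {
	struct model_queue *queue;
	bool completed;
	int status;	/* 0 = success, nonzero = aborted by the host */
};

/* Non-LIVE path: fence teardown, stop the queue, cancel if still pending. */
static void model_complete_timed_out(struct model_request *rq)
{
	struct model_queue *queue = rq->queue;
	struct model_ctrl *ctrl = queue->ctrl;

	/* Fence any other context that might complete the request. */
	pthread_mutex_lock(&ctrl->teardown_lock);
	queue->live = false;			/* stop only this queue */
	if (!rq->completed) {
		rq->status = MODEL_STATUS_HOST_ABORTED;
		rq->completed = true;
	}
	pthread_mutex_unlock(&ctrl->teardown_lock);
}

static enum timeout_action model_timeout(struct model_request *rq)
{
	assert(rq != NULL && rq->queue != NULL && rq->queue->ctrl != NULL);

	struct model_ctrl *ctrl = rq->queue->ctrl;

	if (ctrl->state != CTRL_LIVE) {
		/* Complete now so setup or teardown can make progress. */
		model_complete_timed_out(rq);
		return TIMEOUT_DONE;
	}

	/* LIVE: leave cancellation to error recovery, keep the timer armed. */
	ctrl->recovery_scheduled = true;
	return TIMEOUT_RESET_TIMER;
}
```

Note that the non-LIVE path touches only the one queue and the one
request; nothing in it freezes or unfreezes anything, which is the
point of the change.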

Now that we have the controller teardown_lock, we can safely serialize
request cancellation. This addresses a hang caused by an extra queue
freeze on the controller namespaces, which prevented the unfreeze from
completing correctly.
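
To see why one extra freeze is enough to hang, consider a toy model of
a freeze-depth counter. This is a simplified sketch under the
assumption that a queue resumes only when its depth returns to zero;
the struct and function names are hypothetical and this is not the
block layer implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a queue with a nesting freeze counter. */
struct model_blk_queue {
	int freeze_depth;
};

static void model_freeze(struct model_blk_queue *q)
{
	q->freeze_depth++;
}

/* Returns true only when this unfreeze actually resumes the queue. */
static bool model_unfreeze(struct model_blk_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
	return q->freeze_depth == 0;
}

static bool model_is_frozen(const struct model_blk_queue *q)
{
	return q->freeze_depth > 0;
}
```

With balanced calls, one model_freeze() followed by one
model_unfreeze() resumes the queue. If a second, unmatched
model_freeze() sneaks in (as a full teardown from the timeout handler
would cause), the single model_unfreeze() leaves the depth at 1 and
the queue stays frozen.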

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>