]> git.proxmox.com Git - mirror_ubuntu-hirsute-kernel.git/commit
xprtrdma: Fix occasional transport deadlock
authorChuck Lever <chuck.lever@oracle.com>
Wed, 19 Jun 2019 14:32:48 +0000 (10:32 -0400)
committerAnna Schumaker <Anna.Schumaker@Netapp.com>
Tue, 9 Jul 2019 14:30:16 +0000 (10:30 -0400)
commit05eb06d86685e7d9dac60e6bbb46d7f4c30b056e
tree8f04bce3b88e16afd6912e44681b80752721c171
parent1310051c720a83c5717658bcbff710b260f2bff9
xprtrdma: Fix occasional transport deadlock

Under high I/O workloads, I've noticed that an RPC/RDMA transport
occasionally deadlocks (IOPS goes to zero, and doesn't recover).
Diagnosis shows that the sendctx queue is empty, but when sendctxs
are returned to the queue, the xprt_write_space wake-up never
occurs. The wake-up logic in rpcrdma_sendctx_put_locked is racy.

I noticed that both EMPTY_SCQ and XPRT_WRITE_SPACE are implemented
via an atomic bit. Just one of those is sufficient. Removing
EMPTY_SCQ in favor of the generic bit mechanism makes the deadlock
un-reproducible.

Without EMPTY_SCQ, rpcrdma_buffer::rb_flags is no longer used and
is therefore removed.

Unfortunately this patch does not apply cleanly to stable. If
needed, someone will have to port it and test it.

Fixes: 2fad659209d5 ("xprtrdma: Wait on empty sendctx queue")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
include/trace/events/rpcrdma.h
net/sunrpc/xprtrdma/frwr_ops.c
net/sunrpc/xprtrdma/rpc_rdma.c
net/sunrpc/xprtrdma/verbs.c
net/sunrpc/xprtrdma/xprt_rdma.h