[PATCH v2 1/2] nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
Israel Rukshin
israelr at nvidia.com
Thu Jan 14 11:34:24 EST 2021
On 1/14/2021 11:09 AM, Chao Leng wrote:
> A crash occurs when request completion is delayed by fault injection for
> a long time (nearly 30s). Each namespace has its own request queue, so
> with such a delay several request queues may have timed-out requests at
> the same time, and nvme_rdma_timeout will execute concurrently. Requests
> from different request queues may be queued on the same rdma queue, so
> multiple nvme_rdma_timeout calls may invoke nvme_rdma_stop_queue at the
> same time. The first nvme_rdma_timeout clears NVME_RDMA_Q_LIVE and
> continues stopping the rdma queue (draining the qp), but the others see
> that NVME_RDMA_Q_LIVE is already cleared and complete their requests
> directly. Completing a request before the qp is fully drained may lead
> to a use-after-free condition.
>
> Add a mutex lock to serialize nvme_rdma_stop_queue.
Looks good to me.
I tested this patch in our regression suite.
Tested-by: Israel Rukshin <israelr at nvidia.com>
Reviewed-by: Israel Rukshin <israelr at nvidia.com>
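
For readers of the archive, the serialization described in the quoted commit message can be illustrated with a small user-space sketch. This is not the kernel patch (the diff is not part of this message). It is an analogy using pthreads, and every name in it (demo_queue, demo_stop_queue, demo_timeout, demo_run_concurrent_timeouts) is hypothetical. The atomic flag stands in for NVME_RDMA_Q_LIVE and the "drained" field stands in for the qp drain having finished. Because the flag is cleared and the drain is performed under one mutex, a caller that finds the flag already cleared can only return after the first caller has finished draining.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical user-space stand-in for an rdma queue (not the kernel struct). */
struct demo_queue {
	atomic_bool live;          /* plays the role of NVME_RDMA_Q_LIVE */
	bool drained;              /* set once the simulated qp drain is done */
	pthread_mutex_t stop_lock; /* serializes demo_stop_queue() */
};

struct demo_timeout_arg {
	struct demo_queue *q;
	bool saw_drained;
};

static void demo_queue_init(struct demo_queue *q)
{
	atomic_init(&q->live, true);
	q->drained = false;
	pthread_mutex_init(&q->stop_lock, NULL);
}

/* Clear the live flag and drain under one lock, so late callers wait. */
static void demo_stop_queue(struct demo_queue *q)
{
	int i;

	pthread_mutex_lock(&q->stop_lock);
	if (atomic_exchange(&q->live, false)) {
		/* Simulate a slow drain so other callers pile up on the lock. */
		for (i = 0; i < 100; i++)
			sched_yield();
		q->drained = true;
	}
	pthread_mutex_unlock(&q->stop_lock);
}

/* Simulated timeout handler: stop the queue, then "complete" the request. */
static void *demo_timeout(void *p)
{
	struct demo_timeout_arg *arg = p;

	demo_stop_queue(arg->q);
	/* Completing is only safe if the drain has already finished. */
	arg->saw_drained = arg->q->drained;
	return NULL;
}

/* Run n concurrent handlers; return how many saw a fully drained queue. */
static int demo_run_concurrent_timeouts(int n)
{
	struct demo_queue q;
	struct demo_timeout_arg args[DEMO_MAX_THREADS];
	pthread_t threads[DEMO_MAX_THREADS];
	int i, rc, ok = 0;

	assert(n > 0 && n <= DEMO_MAX_THREADS);
	demo_queue_init(&q);
	for (i = 0; i < n; i++) {
		args[i].q = &q;
		args[i].saw_drained = false;
		rc = pthread_create(&threads[i], NULL, demo_timeout, &args[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < n; i++) {
		pthread_join(threads[i], NULL);
		ok += args[i].saw_drained ? 1 : 0;
	}
	pthread_mutex_destroy(&q.stop_lock);
	return ok;
}
```

With the lock in place, demo_run_concurrent_timeouts(n) returns n, since every handler observes the drained queue. Without the lock, a handler that loses the race on the flag could return from demo_stop_queue while the drain is still in progress, which mirrors the early completion described in the commit message.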