[PATCH 1/2 V2] nvme-rdma: Fix race between queue timeout and error recovery

Bart Van Assche Bart.VanAssche at wdc.com
Sun Apr 8 16:47:44 PDT 2018


On Mon, 2018-04-09 at 00:02 +0300, Sagi Grimberg wrote:
> But we first update aborted_gstate (with interrupts disabled), sync srcu
> and only then terminate expired requests. So I still don't understand
> how we can end up completing a request twice.

There is a bug in the current blk-mq timeout mechanism that can cause both a
regular completion and a timeout completion to occur for the same request.
Tomorrow I will repost the patch that I came up with to fix that issue. See
also "[PATCH v2] blk-mq: Fix race between resetting the timer and completion
handling" (https://marc.info/?l=linux-block&m=151796816127318).
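
To illustrate the general idea of "complete at most once", here is a minimal
user-space sketch. It is not the blk-mq code and not the patch referenced
above; struct request_sketch, req_start() and req_try_complete() are
hypothetical names made up for this example, and C11 atomics stand in for
whatever synchronization the kernel actually uses. Both the regular
completion path and the timeout path would call req_try_complete(), and only
the caller that wins the state transition runs the completion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request states; these are not the blk-mq names. */
enum req_state { REQ_IN_FLIGHT, REQ_COMPLETED };

struct request_sketch {
	_Atomic int state;
	int completions;	/* how many times the completion ran */
};

/* Mark the request as issued and reset the completion counter. */
static void req_start(struct request_sketch *rq)
{
	atomic_store(&rq->state, REQ_IN_FLIGHT);
	rq->completions = 0;
}

/*
 * Called from both the regular completion path and the timeout path.
 * Returns true only for the caller that moves the request from
 * REQ_IN_FLIGHT to REQ_COMPLETED; any later caller gets false and
 * must not touch the request.
 */
static bool req_try_complete(struct request_sketch *rq)
{
	int expected = REQ_IN_FLIGHT;

	if (!atomic_compare_exchange_strong(&rq->state, &expected,
					    REQ_COMPLETED))
		return false;

	rq->completions++;	/* only the winner reaches this point */
	return true;
}
```

With this scheme a second call to req_try_complete() on the same request
returns false, so a late timeout cannot complete a request that has already
been completed normally, and vice versa.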

Bart.

