[PATCH 1/2] nvme-rdma: Fix race between queue timeout and error recovery

Sagi Grimberg sagi at grimberg.me
Sun Apr 8 04:04:22 PDT 2018


> Please send an introduction cover letter explaining what issue you've
> triggered and your overall design.

The commit log is actually wrong. We don't complete the request in two
places; the issue is that, on a timeout, we need to make sure the user
buffer is unmapped before the request is completed. I sent this patch in
reply to a bug report on the list, and that is what it is designed to do.

Given that the timeout callout already just schedules error recovery, the
request will be failed there, after we drain the queue pair. So the choice
is to reset the request's timer in the timeout callout rather than
complete it.
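
To make the ordering concrete, here is a rough userspace model of the idea.
It is only a sketch, not the driver code: every name in it (model_request,
model_queue, model_timeout, model_error_recovery, EH_RESET_TIMER) is made
up for illustration, and the "drain" and "unmap" steps are reduced to
flags. The point is only that the timeout path never completes the request
itself, and that completion happens after drain and unmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
enum eh_action { EH_HANDLED, EH_RESET_TIMER };

struct model_request {
	bool mapped;     /* user buffer is still mapped for RDMA */
	bool completed;  /* request was handed back to the upper layer */
	int status;      /* 0 = ok, nonzero = failed */
};

struct model_queue {
	struct model_request *reqs;
	size_t nr_reqs;
	bool recovery_scheduled;
	bool drained;    /* stands in for draining the queue pair */
};

static void model_complete(struct model_request *req, int status)
{
	/* Invariant from the mail: unmap before completing. */
	assert(!req->mapped);
	req->status = status;
	req->completed = true;
}

/* Timeout callout: only kick error recovery and ask for a timer reset. */
static enum eh_action model_timeout(struct model_queue *q,
				    struct model_request *req)
{
	(void)req; /* the request is deliberately left untouched here */
	q->recovery_scheduled = true;
	return EH_RESET_TIMER;
}

/* Error recovery: drain first, then unmap and fail what is outstanding. */
static void model_error_recovery(struct model_queue *q)
{
	size_t i;

	if (!q->recovery_scheduled)
		return;

	q->drained = true; /* after this the device no longer touches buffers */

	for (i = 0; i < q->nr_reqs; i++) {
		struct model_request *req = &q->reqs[i];

		if (req->completed)
			continue;
		req->mapped = false;     /* unmap the user buffer */
		model_complete(req, -1); /* only now fail the request */
	}
	q->recovery_scheduled = false;
}
```

In this model, calling model_timeout() on a mapped request leaves it mapped
and not completed; only a later model_error_recovery() unmaps it and marks
it failed, which is the ordering described above.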

We could alternatively invalidate the rkey in the timeout callout, but
that won't work with the unsafe rkey mode.


