[PATCH 2/5] NVMe: Use a retryable error code on reset

Wed Dec 30 10:09:08 PST 2015

On Wed, Dec 30, 2015 at 09:52:32AM -0800, Christoph Hellwig wrote:
> On Wed, Dec 30, 2015 at 10:27:48AM -0700, Keith Busch wrote:
> > Use a fake status that can potentially be retried on reset.
> 
> I thought I only added this to make you happy, so I'm fine with
> changing it.
> 
> But I don't understand why NVME_SC_CANCELLED isn't retryable,
> nvme_req_needs_retry doesn't treat negative error codes special.
> If it did we'd need to fix it as we want to be able to rety
> the other cases where it is returned as well.

nvme_req_needs_retry checks NVME status bit "DNR", which happens to be
set with a negative return code.

> Which btw brings me to a bug I noticed a while ago but didn't
> have time to fully understand: nvme_req_needs_retry looks
> at req->start_time vs req->timeout, but that means we'll
> fail any command that times out, even if we resubmit it
> after a controller reset.  Should we really have that clause
> in there?

Yeah, a timed out command isn't retryable here. The check is to allow
requeuing when a controller reset occured for reasons other than
a timeout.

A long time ago, we had total command lifetime timeout value just to
allow timed out commands to requeue after a reset, but that was removed
with the blk-mq conversion. I didn't think it was a big deal, though. The
amount of time it takes to trigger a reset is 60 seconds by default, so
that's a heck of a long time to wait just to requeue the IO. Some users
I spoke with prefered to just fail the command sooner, so I didn't make
any attempt to change this behavior.