[PATCH 1/5] nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()

Sagi Grimberg sagi at grimberg.me
Thu Jan 28 03:15:48 EST 2021


> NVME_REQ_CANCELLED is translated into -EINTR in nvme_submit_sync_cmd(),
> so we should be setting this flag during nvme_cancel_request() to
> ensure that the callers to nvme_submit_sync_cmd() will get the correct
> error code when the controller is reset.

This will cause an issue. nvme_validate_ns() will now return -EINTR,
which means that a scan that fails to validate a namespace during a
reset will end up incorrectly removing it:

--
out:
         /*
          * Only remove the namespace if we got a fatal error back from the
          * device, otherwise ignore the error and just move on.
          *
          * TODO: we should probably schedule a delayed retry here.
          */
         if (ret && ret != -ENOMEM && !(ret > 0 && !(ret & NVME_SC_DNR)))
                 nvme_ns_remove(ns);
--
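
To spell out how that condition treats each class of return value,
here is a userspace sketch of the same predicate. The helper name
scan_removes_ns() is made up for illustration, and the constants are
copied from include/linux/nvme.h as I remember them, so treat the
values as assumptions:

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror include/linux/nvme.h; double-check against your tree. */
#define NVME_SC_INVALID_NS	0x00b	/* example device status */
#define NVME_SC_DNR		0x4000	/* Do Not Retry bit */

/*
 * Hypothetical helper: same expression as the check under "out:" above.
 * Returns true when the scan path would call nvme_ns_remove().
 */
static bool scan_removes_ns(int ret)
{
	return ret && ret != -ENOMEM && !(ret > 0 && !(ret & NVME_SC_DNR));
}
```

With this, -EINTR evaluates to true (remove), the same as a positive
status with DNR set, while 0, -ENOMEM and a positive status without
DNR evaluate to false.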

Did you test this change with a reset loop running concurrently with scans?

If we go down this path (we already had this discussion in the
past), we need to pick one of the following:
- Don't remove the namespace for any of the return codes from
   nvme_submit_sync_cmd (understanding that this may be just a
   transport issue).
- Or, return the raw positive nvme status (which doesn't have
   the DNR bit set) and make sure we return ret codes on top.
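
As a rough sketch of the first option (untested, userspace only, helper
name hypothetical): remove the namespace only when the device itself
returned a status with DNR set, and leave it alone for every negative
errno, since those may just be transport issues:

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror include/linux/nvme.h; double-check against your tree. */
#define NVME_SC_DNR	0x4000	/* Do Not Retry bit */

/*
 * Hypothetical helper, not an existing kernel function.
 * Negative errnos (-EINTR, -ENOMEM, ...) never trigger removal here;
 * only a positive NVMe status with DNR set does.
 */
static bool scan_removes_ns_dnr_only(int ret)
{
	return ret > 0 && (ret & NVME_SC_DNR);
}
```

Under this variant -EINTR from a cancelled request no longer leads to
nvme_ns_remove(), so a reset racing with a scan would leave the
namespace in place.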

But this change alone re-triggers the condition where a namespace
is incorrectly removed during resets.


