[RESEND PATCH] nvme: explicitly use normal NVMe error handling when appropriate
Meneghini, John
John.Meneghini at netapp.com
Thu Aug 13 23:23:47 EDT 2020
On 8/13/20, 10:48 AM, "Mike Snitzer" <snitzer at redhat.com> wrote:
Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
handling by changing multipathing's nvme_failover_req() to short-circuit
path failover and then fallback to NVMe's normal error handling (which
takes care of NVME_SC_CMD_INTERRUPTED).
This detour through native NVMe multipathing code is unwelcome because
it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independent
of any multipathing concerns.
Introduce nvme_status_needs_local_error_handling() to prioritize
non-failover retry, when appropriate, in terms of normal NVMe error
handling. nvme_status_needs_local_error_handling() will naturally evolve
to include handling of any other errors that normal error handling must
be used for.
How is this any better than blk_path_error()?
nvme_failover_req()'s ability to fallback to normal NVMe error handling
has been preserved because it may be useful for future NVME_SC that
nvme_status_needs_local_error_handling() hasn't been trained for yet.
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
---
drivers/nvme/host/core.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 88cff309d8e4..be749b690af7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -252,6 +252,16 @@ static inline bool nvme_req_needs_retry(struct request *req)
return true;
}
+static inline bool nvme_status_needs_local_error_handling(u16 status)
+{
+	switch (status & 0x7ff) {
+	case NVME_SC_CMD_INTERRUPTED:
+		return true;
+	default:
+		return false;
+	}
+}
I assume that by nvme_status_needs_local_error_handling() you mean: should the NVMe core
driver handle the command retry?
If so, I don't think this function will ever work correctly because, as discussed, whether or
not the command needs to be retried has nothing to do with the NVMe Status Code field itself. It
has to do with the DNR bit, the CRD field, and the Status Code Type field, in that order.
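To make that ordering concrete, here is a minimal user-space sketch of a retry decision that looks at DNR first, then CRD, then the Status Code Type. It is an illustration only, not the driver's code: the names classify_status(), struct retry_decision and the ACTION_* values are hypothetical, the masks assume the 15-bit status layout (SC in bits 7:0, SCT in 10:8, CRD in 12:11, DNR in bit 14), treating SCT 3h (path related) as the only failover trigger is an assumption, and retry counting is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 15-bit status value (phase bit already dropped). */
#define STATUS_SC_SCT_MASK 0x07ffu  /* Status Code + Status Code Type */
#define STATUS_SCT_MASK    0x0700u
#define STATUS_SCT_SHIFT   8
#define STATUS_CRD_MASK    0x1800u  /* Command Retry Delay index */
#define STATUS_CRD_SHIFT   11
#define STATUS_DNR         0x4000u  /* Do Not Retry */

#define SCT_PATH_RELATED   0x3u

/* Hypothetical outcome names, for illustration only. */
enum retry_action { ACTION_COMPLETE, ACTION_FAILOVER, ACTION_RETRY_LOCAL };

struct retry_decision {
	enum retry_action action;
	unsigned int crd_index;  /* 0 = no delay requested, 1..3 = delay slot */
};

static struct retry_decision classify_status(uint16_t status, bool multipath)
{
	struct retry_decision d = { ACTION_COMPLETE, 0 };

	/* Success: nothing to retry. */
	if ((status & STATUS_SC_SCT_MASK) == 0)
		return d;

	/* 1. DNR set: the controller says retrying is pointless. */
	if (status & STATUS_DNR)
		return d;

	/* 2. CRD: how long the controller asks the host to wait. */
	d.crd_index = (status & STATUS_CRD_MASK) >> STATUS_CRD_SHIFT;

	/* 3. SCT: a path-related error goes to another path (assumption). */
	if (multipath &&
	    ((status & STATUS_SCT_MASK) >> STATUS_SCT_SHIFT) == SCT_PATH_RELATED)
		d.action = ACTION_FAILOVER;
	else
		d.action = ACTION_RETRY_LOCAL;

	return d;
}
```

Note that the Status Code itself (for example 0x21, Command Interrupted) is never compared against in this sketch; only the three fields named above influence the result.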
+
static void nvme_retry_req(struct request *req)
{
struct nvme_ns *ns = req->q->queuedata;
@@ -270,7 +280,8 @@ static void nvme_retry_req(struct request *req)
void nvme_complete_rq(struct request *req)
{
- blk_status_t status = nvme_error_status(nvme_req(req)->status);
+ u16 nvme_status = nvme_req(req)->status;
+ blk_status_t status = nvme_error_status(nvme_status);
trace_nvme_complete_rq(req);
@@ -280,7 +291,8 @@ void nvme_complete_rq(struct request *req)
nvme_req(req)->ctrl->comp_seen = true;
if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
- if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
+ if (!nvme_status_needs_local_error_handling(nvme_status) &&
This defeats the nvme-multipath logic by inserting a second evaluation of the NVMe Status Code into the retry logic.
This is basically another version of blk_path_error().
In fact, in your case REQ_NVME_MPATH is probably not set, so I don't see what difference this would make at all.
/John
+ (req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
return;
if (!blk_queue_dying(req->q)) {
--
2.18.0