[PATCH] nvme: explicitly use normal NVMe error handling when appropriate
Mike Snitzer
snitzer at redhat.com
Mon Aug 10 13:22:09 EDT 2020
On Mon, Aug 10 2020 at 10:36am -0400,
Mike Snitzer <snitzer at redhat.com> wrote:
> On Fri, Aug 07 2020 at 7:35pm -0400,
> Sagi Grimberg <sagi at grimberg.me> wrote:
>
> >
> > >>Hey Mike,
...
> > >I think NVMe can easily fix this by having an earlier stage of checking,
> > >e.g. nvme_local_retry_req(), that shortcircuits ever getting to
> > >higher-level multipathing consideration (be it native NVMe or DM
> > >multipathing) for cases like NVME_SC_CMD_INTERRUPTED.
> > >To be clear: the "default" case of nvme_failover_req() that returns
> > >false to fallback to NVMe's "local" normal NVMe error handling -- that
> > >can stay.. but a more explicit handling of cases like
> > >NVME_SC_CMD_INTERRUPTED should be added to a nvme_local_retry_req()
> > >check that happens before nvme_failover_req() in nvme_complete_rq().
> >
> > I don't necessarily agree with having a dedicated nvme_local_retry_req().
> > a request that isn't failed over, goes to local error handling (retry or
> > not). I actually think that just adding the condition to
> > nvme_complete_req and having nvme_failover_req reject it would work.
> >
> > Keith?
>
> I think that is basically what I'm thinking too.
From: Mike Snitzer <snitzer at redhat.com>
Subject: nvme: explicitly use normal NVMe error handling when appropriate
Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
handling by changing multipathing's nvme_failover_req() to short-circuit
path failover and then fallback to NVMe's normal error handling (which
takes care of NVME_SC_CMD_INTERRUPTED).
This detour through native NVMe multipathing code is unwelcome because
it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independent
of any multipathing concerns.
Introduce nvme_status_needs_local_error_handling() to prioritize
non-failover retry, when appropriate, in terms of normal NVMe error
handling. nvme_status_needs_local_error_handling() will naturely evolve
to include handling of any other errors that normal error handling must
be used for.
nvme_failover_req()'s ability to fallback to normal NVMe error handling
has been preserved because it may be useful for future NVME_SC that
nvme_status_needs_local_error_handling() hasn't yet been trained for.
Signed-off-by: Mike Snitzer <snitzer at redhat.com>
---
drivers/nvme/host/core.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 88cff309d8e4..be749b690af7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -252,6 +252,16 @@ static inline bool nvme_req_needs_retry(struct request *req)
return true;
}
+static inline bool nvme_status_needs_local_error_handling(u16 status)
+{
+ switch (status & 0x7ff) {
+ case NVME_SC_CMD_INTERRUPTED:
+ return true;
+ default:
+ return false;
+ }
+}
+
static void nvme_retry_req(struct request *req)
{
struct nvme_ns *ns = req->q->queuedata;
@@ -270,7 +280,8 @@ static void nvme_retry_req(struct request *req)
void nvme_complete_rq(struct request *req)
{
- blk_status_t status = nvme_error_status(nvme_req(req)->status);
+ u16 nvme_status = nvme_req(req)->status;
+ blk_status_t status = nvme_error_status(nvme_status);
trace_nvme_complete_rq(req);
@@ -280,7 +291,8 @@ void nvme_complete_rq(struct request *req)
nvme_req(req)->ctrl->comp_seen = true;
if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
- if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
+ if (!nvme_status_needs_local_error_handling(nvme_status) &&
+ (req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
return;
if (!blk_queue_dying(req->q)) {
--
2.18.0
More information about the Linux-nvme
mailing list