[PATCH 4/4] nvme: redirect commands on dying queue

Wed Aug 19 06:01:46 EDT 2020

On 2020/8/18 15:11, Christoph Hellwig wrote:
> From: Chao Leng <lengchao at huawei.com>
> 
> If a command sent through nvme-multipath fails on a dying queue, resend it
> on another path.
> 
> Signed-off-by: Chao Leng <lengchao at huawei.com>
> [hch: rebased on top of the completion refactoring]
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> Reviewed-by: Sagi Grimberg <sagi at grimberg.me>
> Reviewed-by: Mike Snitzer <snitzer at redhat.com>
> ---
>   drivers/nvme/host/core.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 36bb0fe9c7f6f8..a6785b86359fab 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -274,13 +274,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
>   		return COMPLETE;
>   
>   	if (req->cmd_flags & REQ_NVME_MPATH) {
> -		if (nvme_is_path_error(nvme_req(req)->status))
> +		if (nvme_is_path_error(nvme_req(req)->status) ||
> +		    blk_queue_dying(req->q))
>   			return FAILOVER;
> +	} else {
> +		if (blk_queue_dying(req->q))
> +			return COMPLETE;
>   	}
>   
> -	if (blk_queue_dying(req->q))
> -		return COMPLETE;
> -
>   	return RETRY;
>   }

Suggestion:
Do not use blk_noretry_request. The local retry mechanism defined by the
NVMe protocol conflicts with REQ_FAILFAST_TRANSPORT. blk_noretry_request
is not a good choice for NVMe. It is not a good choice for SCSI either,
so SCSI does not use this macro.
We can separate REQ_FAILFAST_TRANSPORT from the other FAILFAST flags:
for REQ_FAILFAST_TRANSPORT, do a local retry on non-path errors;
for the other FAILFAST flags, complete the request with the error code
immediately.
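
To make the intended split concrete, here is a small userspace model of the
check, not kernel code. The SK_* flag values are placeholders (the real
REQ_FAILFAST_* bits differ between kernel versions), and sk_is_path_error
assumes the host driver's test for a "path related" status code type
(value 0x3 in bits 10:8 of the status field).

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits for illustration; not the kernel's REQ_FAILFAST_* values. */
#define SK_FAILFAST_DEV        (1u << 0)
#define SK_FAILFAST_TRANSPORT  (1u << 1)
#define SK_FAILFAST_DRIVER     (1u << 2)

/* Assumed model of nvme_is_path_error(): status code type 0x3 in bits 10:8. */
static bool sk_is_path_error(uint16_t status)
{
	return (status & 0x700) == 0x300;
}

/*
 * Model of the proposed nvme_noretry_request():
 * - DRIVER/DEV failfast: never retry locally.
 * - TRANSPORT failfast: skip the local retry only for path errors.
 * - otherwise: local retry is allowed.
 */
static bool sk_noretry_request(uint32_t cmd_flags, uint16_t status)
{
	if (cmd_flags & (SK_FAILFAST_DRIVER | SK_FAILFAST_DEV))
		return true;
	if ((cmd_flags & SK_FAILFAST_TRANSPORT) && sk_is_path_error(status))
		return true;
	return false;
}
```

With this model, a request carrying only the transport failfast flag is still
retried locally when the status is not a path error, while the same request
with a path error status is not.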
---
  drivers/nvme/host/core.c | 12 +++++++++++-
  1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a6785b86359f..cfd870780353 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -257,6 +257,16 @@ static void nvme_retry_req(struct request *req)
         blk_mq_delay_kick_requeue_list(req->q, delay);
  }

+static inline bool nvme_noretry_request(struct request *req)
+{
+       if (req->cmd_flags & (REQ_FAILFAST_DRIVER | REQ_FAILFAST_DEV))
+               return true;
+       if ((req->cmd_flags & REQ_FAILFAST_TRANSPORT) &&
+           nvme_is_path_error(nvme_req(req)->status))
+               return true;
+       return false;
+}
+
  enum nvme_disposition {
         COMPLETE,
         RETRY,
@@ -268,7 +278,7 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
         if (likely(nvme_req(req)->status == 0))
                 return COMPLETE;

-       if (blk_noretry_request(req) ||
+       if (nvme_noretry_request(req) ||
             (nvme_req(req)->status & NVME_SC_DNR) ||
             nvme_req(req)->retries >= nvme_max_retries)
                 return COMPLETE;
--