[PATCH] nvme: restore use of blk_path_error() in nvme_complete_rq()

Thu Aug 6 18:42:55 EDT 2020

On 8/6/20, 3:20 PM, "Mike Snitzer" <snitzer at redhat.com> wrote:

    From: Mike Snitzer <snitzer at redhat.com>
    Date: Thu, 2 Jul 2020 01:43:27 -0400
    Subject: [PATCH] nvme: restore use of blk_path_error() in nvme_complete_rq()

    Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
    status") removed NVMe's use blk_path_error() -- presummably because
    nvme_failover_req() was modified to return whether a command should be
    retried or not.

Yes, that was a major part of this patch.  nvme_failover_req() was completely 
reworked and fixed because it was badly broken. 

    By not using blk_path_error() there is serious potential for
    regression for how upper layers (e.g. DM multipath) respond to NVMe's
    error conditions.  This has played out now due to commit 35038bffa87
    ("nvme: Translate more status codes to blk_status_t").  Had NVMe
    continued to use blk_path_error() it too would not have retried an
    NVMe command that got NVME_SC_CMD_INTERRUPTED.

I'm not sure others would consider it broken.  The idea was to get the blk_path_error()
logic out of the core nvme driver completion routine. 

    Fix this potential for NVMe error handling regression, possibly
    outside NVMe, by restoring NVMe's use of blk_path_error().

The point is: blk_path_error() has nothing to do with NVMe errors.
This is dm-multipath logic stuck in the middle of the NVMe error 
handling code.

C symbol: blk_path_error

  File        Function             Line
0 dm-mpath.c  multipath_end_io     1606 if (error && blk_path_error(error)) {
1 dm-mpath.c  multipath_end_io_bio 1646 if (!*error || !blk_path_error(*error))
2 blk_types.h blk_path_error        118 static inline bool blk_path_error(blk_status_t error)

    Fixes: 764e9332098c0 ("nvme-multipath: do not reset on unknown status")
    Cc: stable at vger.kerneel.org
    Signed-off-by: Mike Snitzer <snitzer at redhat.com>
    ---
    drivers/nvme/host/core.c | 9 +++++++--
    1 file changed, 7 insertions(+), 2 deletions(-)

    diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
    index 6585d57112ad..072f629da4d8 100644
    --- a/drivers/nvme/host/core.c
    +++ b/drivers/nvme/host/core.c
    @@ -290,8 +290,13 @@ void nvme_complete_rq(struct request *req)
                    nvme_req(req)->ctrl->comp_seen = true;

            if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
    -               if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
    -                       return;
    +               if (blk_path_error(status)) {
    +                       if (req->cmd_flags & REQ_NVME_MPATH) {
    +                               if (nvme_failover_req(req))
    +                                       return;
    +                               /* fallthru to normal error handling */
    +                       }
    +               }

This would basically undo the patch Hannes, Christoph, and I put together in
commit 764e9332098c0.  This patch greatly simplified and improved the
whole nvme_complete_rq() code path, and I don't support undoing that change.

If you want blk_path_error() to handle any new NVMe error status differently,  I would suggest creating a new 
BLK_STS code to translate that NVMe status and add it to the one place in nvme_complete_rq() designed to
do the NVMe status to BLK_STS translation:  nvme_error_status().  Then you can change  your application specific 
error handling code to handle the new BLK_STS code appropriately, further up the stack.  Don't stick your
application specific error handling logic into the middle of the nvme driver's nvme_complete_rq routine.

/John
                    if (!blk_queue_dying(req->q)) {
                            nvme_retry_req(req);
    --
    2.18.0