nvme: restore use of blk_path_error() in nvme_complete_rq()

Mon Aug 10 11:06:10 EDT 2020

On Mon, Aug 10 2020 at  8:43am -0400,
Christoph Hellwig <hch at infradead.org> wrote:

> Just returning from my vacation, and I'm really surprised about
> this discussion.

Welcome back!

FYI, I got roped into this stuff again due to reports of regression from
Red Hat and partner QE testing of latest RHEL 8.3 stuff.

I have been away from all things NVMe for a while so apologies for my
misfire that dwelled on nvme_complete_rq()'s blk_path_error() call being
removed.

> If you want to fix a device mapper / block layer interaction you do
> not change nvme code to use suboptimal interfaces.
> 
> One of the reasons for the nvme multipath design was to allow for
> tighter integration with protocol features, and making nvme do a detour
> through the block layer for no good reason at all does not help with
> that at all.

Yes, reinstating blk_path_error() was a distraction.  It is still an
outstanding point to discuss, but it can be pushed to the
back-back-burner.

> And if you want to support the TP that added the new command interrupted
> status code please read through it - the status code itself is just a
> small part of it, and none of the changes proposed in these discussions
> would lead to a proper implementation.

I think you likely glossed over aspects of my exchange with Sagi.  That's
fine.  I don't fault you in any way for not wanting to labor over this
thread to this point! ;)

Really, I shouldn't _need_ to be any the wiser about all this inherently
local NVMe error handling, which is increasingly complex and seemingly
has nothing to do with multipathing.

NVMe should handle the command interrupted status code, ANA and now
ACRE locally, without conflating that handling with multipath failover
by detouring through nvme_failover_req() et al.
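
As a rough illustration of that separation, here is a minimal userspace
C sketch.  It is not kernel code and not the patches mentioned in this
mail.  Every name in it (sketch_decide(), the SKETCH_* constants, the
0x7ff mask) is hypothetical and the status values are stand-ins for the
sketch only.  The idea it shows: decide once per completion whether a
command is completed, retried locally, or failed over, so that a
locally retryable status such as command interrupted never takes the
failover path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in status values, for this sketch only. */
enum sketch_status {
	SKETCH_SC_SUCCESS         = 0x000,
	SKETCH_SC_CMD_INTERRUPTED = 0x021,  /* retry on the same path */
	SKETCH_SC_ANA_TRANSITION  = 0x303,  /* path-related error */
	SKETCH_SC_DNR             = 0x4000, /* "do not retry" flag bit */
};

/* Assumed mask that strips flag bits and keeps the status code. */
#define SKETCH_SC_MASK 0x7ff

enum sketch_disposition {
	SKETCH_COMPLETE,  /* finish the request, success or final error */
	SKETCH_RETRY,     /* requeue locally on the same path */
	SKETCH_FAILOVER,  /* hand the request to another path */
};

/* Only path-related statuses are candidates for failover. */
static bool sketch_is_path_error(uint16_t status)
{
	return (status & SKETCH_SC_MASK) == SKETCH_SC_ANA_TRANSITION;
}

/* Single decision point: local retry is the default, failover the exception. */
static enum sketch_disposition
sketch_decide(uint16_t status, bool multipath,
	      unsigned int retries, unsigned int max_retries)
{
	if ((status & SKETCH_SC_MASK) == SKETCH_SC_SUCCESS)
		return SKETCH_COMPLETE;

	/* Give up if the device said not to retry or retries ran out. */
	if ((status & SKETCH_SC_DNR) || retries >= max_retries)
		return SKETCH_COMPLETE;

	/* Fail over only for path errors, and only when multipath is in use. */
	if (multipath && sketch_is_path_error(status))
		return SKETCH_FAILOVER;

	/* Everything else, including command interrupted, is retried locally. */
	return SKETCH_RETRY;
}
```

With this shape, sketch_decide(SKETCH_SC_CMD_INTERRUPTED, true, 0, 5)
yields SKETCH_RETRY even on a multipath device, while the same call
with SKETCH_SC_ANA_TRANSITION yields SKETCH_FAILOVER.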

So that is what I hope to rectify with some simple patches.

Mike