[PATCH] nvme-multipath: fix io accounting on failover

Wed May 22 07:18:10 PDT 2024

On Wed, May 22, 2024 at 06:32:11PM +0530, Nilay Shroff wrote:
> 
> 
> On 5/21/24 23:37, Keith Busch wrote:
> > From: Keith Busch <kbusch at kernel.org>
> > 
> > There is I/O stats accounting that needs to be handled, so don't call
> > blk_mq_end_request() directly. Use the existing nvme_end_req() helper
> > that already handles everything.
> > 
> The changes look good. However, I have a question: why do we retry an I/O
> when that I/O is cancelled? For instance, when a multipath I/O request is
> cancelled (from nvme_cancel_request()), we re-queue the bio in
> nvme_failover_req(). Similarly, for a non-multipath request, we retry the
> request in nvme_retry_req() until its retries are maxed out at
> nvme_max_retries. So wouldn't it be appropriate to drop the cancelled
> request instead of retrying it?
> 
> However, I do understand retrying a request on a different path when the
> request's completion status indicates a path-related error.

A cancelled request just means the host thinks the target failed to
produce a response. It doesn't mean the host stopped caring about the
command; the host still wants it to succeed, but has determined that
corrective action is needed to reclaim and resubmit the command.
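
To make the distinction concrete, here is a rough userspace sketch of the
decision described above. It is only a simplified model, not the driver
code: every name in it (model_req, model_decide(), the ST_* and DISP_*
values, MODEL_MAX_RETRIES and its value) is hypothetical and chosen for
illustration. The assumption it encodes is that a host-side cancellation is
treated like a path problem, so the command is resubmitted rather than
dropped.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel symbols. */
enum disposition { DISP_COMPLETE, DISP_RETRY, DISP_FAILOVER };

enum model_status {
	ST_OK = 0,
	ST_MEDIA_ERROR,		/* target answered with a non-path error */
	ST_PATH_ERROR,		/* target answered with a path-related error */
	ST_HOST_CANCELLED,	/* host reclaimed the command; no response seen */
};

struct model_req {
	enum model_status status;
	bool multipath;		/* request was issued through a multipath node */
	unsigned int retries;	/* how many times it has been resubmitted */
};

/* Stand-in for the nvme_max_retries limit; the value here is assumed. */
#define MODEL_MAX_RETRIES 5u

/* Assumption: a cancellation says nothing about the command itself,
 * only that this path did not deliver a response. */
bool model_is_path_error(enum model_status st)
{
	return st == ST_PATH_ERROR || st == ST_HOST_CANCELLED;
}

enum disposition model_decide(const struct model_req *req)
{
	if (req->status == ST_OK)
		return DISP_COMPLETE;		/* success: nothing to redo */
	if (req->retries >= MODEL_MAX_RETRIES)
		return DISP_COMPLETE;		/* give up, report the error */
	if (req->multipath && model_is_path_error(req->status))
		return DISP_FAILOVER;		/* requeue on another path */
	return DISP_RETRY;			/* resubmit on the same path */
}
```

In this model a cancelled multipath request yields DISP_FAILOVER, and a
cancelled non-multipath request yields DISP_RETRY until the retry limit is
reached, which matches the two cases raised in the question.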