[PATCH] nvme-multipath: fix io accounting on failover

Thu May 23 00:00:55 PDT 2024


On 5/22/24 19:48, Keith Busch wrote:
> On Wed, May 22, 2024 at 06:32:11PM +0530, Nilay Shroff wrote:
>>
>>
>> On 5/21/24 23:37, Keith Busch wrote:
>>> From: Keith Busch <kbusch at kernel.org>
>>>
>>> There is io stats accounting that needs to be handled, so don't call
>>> blk_mq_end_request() directly. Use the existing nvme_end_req() helper
>>> that already handles everything.
>>>
>> The changes look good. However, I have a question: why do we retry an IO
>> when that IO is cancelled? For instance, when a multipath IO request is cancelled
>> (from nvme_cancel_request()), we re-queue the bio in nvme_failover_req().
>> Similarly, for a non-multipath request, we retry the request in nvme_retry_req()
>> until its retries are maxed out at nvme_max_retries. So wouldn't it be
>> appropriate to drop the cancelled request instead of retrying?
>>
>> However, I do understand retrying a request on a different path when the
>> request completion status indicates a path-related error.
> 
> A cancelled request just means the host thinks the target failed to
> produce a response. It doesn't mean the host stopped caring about the
> command; the host still wants it to succeed, but determined corrective
> action is needed to reclaim and resubmit the command.
> 
Thanks Keith, got it!
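
To make the explanation above concrete, here is a minimal userspace sketch of
how I understand the completion decision: a cancelled request carries a
host-side, path-related status, so it is failed over (multipath) or retried
(non-multipath) rather than dropped. This is a simplified model, not the
driver code. All model_* names and the constant values are assumptions made
for illustration, and the retry limit of 5 only stands in for nvme_max_retries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants; the values are assumptions, not taken from kernel headers. */
#define MODEL_SC_HOST_ABORTED_CMD 0x371u  /* status assumed for a cancelled request */
#define MODEL_SC_DNR              0x4000u /* "do not retry" bit */
#define MODEL_MAX_RETRIES         5       /* stands in for nvme_max_retries */

enum model_disposition { MODEL_COMPLETE, MODEL_RETRY, MODEL_FAILOVER };

/* Hypothetical, stripped-down stand-in for a request. */
struct model_req {
	uint16_t status;   /* completion status, 0 means success */
	int retries;       /* retries already attempted */
	bool multipath;    /* request was submitted through the multipath node */
};

/* Path-related status: status code type field equals 3. */
static bool model_is_path_error(uint16_t status)
{
	return (status & 0x700u) == 0x300u;
}

/* Decide what to do with a completed request. */
static enum model_disposition model_decide(const struct model_req *req)
{
	/* Success: nothing to retry. */
	if (req->status == 0)
		return MODEL_COMPLETE;

	/* The failure is final when DNR is set or the retries are used up. */
	if ((req->status & MODEL_SC_DNR) || req->retries >= MODEL_MAX_RETRIES)
		return MODEL_COMPLETE;

	/* Multipath request with a path-related error: requeue on another path. */
	if (req->multipath && model_is_path_error(req->status))
		return MODEL_FAILOVER;

	/* Otherwise resubmit; the host still wants the command to succeed. */
	return MODEL_RETRY;
}
```

In this model a cancelled multipath request yields MODEL_FAILOVER and a
cancelled non-multipath request yields MODEL_RETRY until the retry limit is
reached, which matches the behaviour discussed in this thread.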

--Nilay