[PATCH] nvme-core: fix io interrupt when work with dm-multipath

Thu Aug 6 22:28:02 EDT 2020

On 2020/8/7 8:03, Sagi Grimberg wrote:
>> I think the problem here is that the current BLK_STS and FAST_FAIL mechanisms
>> were designed to support legacy protocols like SCSI.  They assume that all retry behavior is
>> controlled by other components in the stack.  NVMe is presenting new protocol features
>> and semantics which probably can't be effectively supported by those legacy BLK_STS
>> and FAST_FAIL mechanisms without passing more information up the stack.

> 
> Not sure how generic this new blk status would be.. It would probably
> make a lot more sense if there are other consumers for such a status
> code.
> 
> Maybe we could set it to BLK_STS_TIMEOUT with a big fat comment for why
> we are doing this...
Introducing a new BLK_STS error may be a good choice, but it is not
friendly to forward compatibility.

Now we have three choices: BLK_STS_TIMEOUT, BLK_STS_RESOURCE, or
BLK_STS_IOERR (the default). In each case DM-multipath will fail the
path and then fail over to retry. The difference is that
BLK_STS_RESOURCE delays the retry by 100ms, while BLK_STS_TIMEOUT and
BLK_STS_IOERR retry immediately.
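
To make the comparison concrete, here is a rough user-space sketch of the
retry timing described above. It is not kernel code: the enum, its values,
and the function name sketch_retry_delay_ms() are made up for illustration,
and the 100ms figure is simply the one quoted in this mail.

```c
/* User-space illustration only; these are stand-ins, not kernel definitions. */
enum sketch_retry_status {
	SKETCH_RETRY_STS_IOERR,
	SKETCH_RETRY_STS_TIMEOUT,
	SKETCH_RETRY_STS_RESOURCE,
};

/*
 * Delay, in milliseconds, before the failed-over I/O is retried,
 * following the behavior described in this mail (assumed, not measured).
 */
static unsigned int sketch_retry_delay_ms(enum sketch_retry_status sts)
{
	switch (sts) {
	case SKETCH_RETRY_STS_RESOURCE:
		return 100;	/* retried after a delay */
	case SKETCH_RETRY_STS_TIMEOUT:
	case SKETCH_RETRY_STS_IOERR:
	default:
		return 0;	/* retried immediately */
	}
}
```

In this model the only thing that distinguishes the three candidates is the
delay; the path is failed in every case.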

There is no perfect choice, but we have to choose one. I suggest
keeping the default and translating NVME_SC_CMD_INTERRUPTED to
BLK_STS_IOERR.
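
As a rough sketch of what "keep the default" means, the translation could
look like the following. This is a standalone user-space model, not the
actual nvme-core code: the sketch_xlate_* names and enum values are
hypothetical, and only the idea (no dedicated case for the interrupted
status, so it lands in the default branch) comes from this thread.

```c
/* Hypothetical stand-ins for the NVMe and block-layer status codes. */
enum sketch_xlate_nvme_sc {
	SKETCH_XLATE_SC_SUCCESS,
	SKETCH_XLATE_SC_CMD_INTERRUPTED,
	SKETCH_XLATE_SC_OTHER,
};

enum sketch_xlate_blk_sts {
	SKETCH_XLATE_STS_OK,
	SKETCH_XLATE_STS_IOERR,
};

/* Translate an NVMe status into a block-layer status (illustrative only). */
static enum sketch_xlate_blk_sts
sketch_xlate_status(enum sketch_xlate_nvme_sc sc)
{
	switch (sc) {
	case SKETCH_XLATE_SC_SUCCESS:
		return SKETCH_XLATE_STS_OK;
	case SKETCH_XLATE_SC_CMD_INTERRUPTED:
		/* No special mapping: fall through to the default. */
	default:
		return SKETCH_XLATE_STS_IOERR;
	}
}
```

The explicit case label for the interrupted status is there only to document
the decision; dropping it would give the same result.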