[PATCH] nvme-core: fix io interrupt when work with dm-multipah

Christoph Hellwig hch at lst.de
Wed Jul 29 01:59:03 EDT 2020


On Wed, Jul 29, 2020 at 10:54:29AM +0800, Chao Leng wrote:
>
>
> On 2020/7/28 19:19, Christoph Hellwig wrote:
>> On Mon, Jul 27, 2020 at 01:58:18PM +0800, Chao Leng wrote:
>>> The protocol NVM-Express-1.4 define:
>>> Command Interrupted: Command processing was interrupted and the
>>> controller is unable to successfully complete the command. The host
>>> should retry the command. If this status code is returned, then
>>> the controller shall clear the Do Not Retry bit to ‘0’ in the Status
>>> field of the CQE (refer to Figure 124). The controller shall not return
>>> this status code unless the host has set the Advanced Command Retry
>>> Enable (ACRE) field to 1h in the Host Behavior Support feature(refer to
>>> section 5.21.1.22).
>>>
>>> According the protocol define, NVME_SC_CMD_INTERRUPTED need retry.
>>> The error code NVME_SC_CMD_INTERRUPTED should not translate to
>>> BLK_STS_TARGET, because if the error code translate to BLK_STS_TARGET,
>>> dm-multipah will return error to application. So if target return error
>>> code NVME_SC_CMD_INTERRUPTED, io will interrupt. NVME_SC_CMD_INTERRUPTED
>>> should translate to BLK_STS_IOERR by default, dm-multipath will fail
>>> over to other path retry the io.
>>
>> IOERR still seems wrong, though.
>> .
>
> BLK_STS_TARGET means target has critical error. NVME_SC_CMD_INTERRUPTED
> just means target need retry io. It is not suitable to translate
> NVME_SC_CMD_INTERRUPTED to BLK_STS_TARGET. Maybe translate to
> BLK_STS_IOERR is also not suitable, we should translate
> NVME_SC_CMD_INTERRUPTED to BLK_STS_AGAIN.
> We can do like this:

BLK_STS_AGAIN is a bad choice as we use it for calls that block when
the callers asked for non-blocking submission.  I'm really not sure
we want to change anything here - the error definition clearly states
it is not a failure but a request to retry later.



More information about the Linux-nvme mailing list