[PATCH 3/3] nvme: redirect commands on dying queue

Sagi Grimberg sagi at grimberg.me
Mon Aug 17 03:46:30 EDT 2020


>>>> If a command sent through nvme-multipath failed on a dying queue,
>>>> resend it on another path.
>>>
>>> So this is a race where we got a retry-able status from the controller
>>> (not from the host teardown sequence) and we just happen to see
>>> a dying queue?
>>
>> I think so, maybe Chao can explain the scenario in a little more detail.
> The scenario: an I/O has already completed with a non-path error (such as
> NVME_SC_CMD_INTERRUPTED or NVME_SC_DATA_XFER_ERROR) but is still waiting
> to be processed. At the same time a controller delete happens, which may
> set QUEUE_FLAG_DYING on the queue when it calls nvme_remove_namespaces.
> Then, for example, if the fabric is RDMA, the delete calls
> nvme_rdma_delete_ctrl, which drains the QP first. As a result the I/O
> that returned with a non-path error can neither fail over to another
> path nor be retried locally, so the I/O is interrupted.

OK
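
For readers following the thread, the scenario can be sketched as a small
user-space model of the completion decision. This is only an illustration
under stated assumptions: the enum, the struct, and both functions are
hypothetical names, not kernel symbols, and the "before" logic is an
assumed reading of the behaviour described above (failover only for path
errors, local retry only when the queue is not dying). The "after" logic
follows the patch description: a multipath command that fails on a dying
queue is resent on another path.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel symbols. */
enum disposition {
	DISP_COMPLETE,	/* finish the request with its error status */
	DISP_RETRY,	/* requeue the request on the same path */
	DISP_FAILOVER,	/* resend the request on another path */
};

struct cmd_state {
	bool retryable;		/* error status permits another attempt */
	bool multipath;		/* command was sent through nvme-multipath */
	bool path_error;	/* status is a path-related error */
	bool queue_dying;	/* QUEUE_FLAG_DYING is set on the queue */
};

/* Assumed behaviour before the patch: failover only on path errors. */
enum disposition decide_before(const struct cmd_state *c)
{
	if (!c->retryable)
		return DISP_COMPLETE;
	if (c->multipath && c->path_error)
		return DISP_FAILOVER;
	if (!c->queue_dying)
		return DISP_RETRY;
	/* Non-path error on a dying queue: no failover, no local retry. */
	return DISP_COMPLETE;
}

/* Proposed behaviour: a dying queue also redirects multipath commands. */
enum disposition decide_after(const struct cmd_state *c)
{
	if (!c->retryable)
		return DISP_COMPLETE;
	if (c->multipath && (c->path_error || c->queue_dying))
		return DISP_FAILOVER;
	if (!c->queue_dying)
		return DISP_RETRY;
	return DISP_COMPLETE;
}
```

In this model, a retryable multipath command with a non-path error on a
dying queue ends in DISP_COMPLETE under decide_before() and in
DISP_FAILOVER under decide_after(); with a live queue both return
DISP_RETRY.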


