[PATCH V3 7/8] nvme: pci: recover controller reliably

Ming Lei tom.leiming at gmail.com
Fri May 4 17:16:37 PDT 2018


On Fri, May 4, 2018 at 4:28 PM, jianchao.wang
<jianchao.w.wang at oracle.com> wrote:
> Hi Ming,
>
> On 05/04/2018 04:02 PM, Ming Lei wrote:
>>> nvme_error_handler should invoke nvme_reset_ctrl instead of introducing another interface.
>>> Then it is more convenient to ensure that there will be only one resetting instance running.
>>>
>> But as you mentioned above, reset_work has to be split into two
>> contexts to handle an IO timeout during wait_freeze in reset_work,
>> so a single instance of nvme_reset_ctrl() may not work well.
>
> I mean the EH kthread and the reset_work, both of which could reset the ctrl, rather than
> the pre- and post-reset contexts.
>
> Honestly, I somewhat doubt whether it is worth trying to recover from [1].
> The EH kthread solution could make things easier, but the code for recovering from [1] has
> made things really complicated. It becomes more difficult to unify nvme-pci, rdma and fc.

Another choice may be nested EH, which should be easier to implement:

- run the whole recovery procedure (shutdown & reset) in one single context
- and start a new context to handle any new timeout that occurs during the
  previous recovery, in the same way

The two approaches are just like sync IO vs AIO.
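
To make the nested EH idea a bit more concrete, here is a rough user-space
toy model, not driver code. All names (toy_ctrl, toy_eh_recover,
MAX_EH_DEPTH) are made up for illustration and do not exist in nvme-pci, and
a plain nested function call stands in for starting a new EH context such as
a kthread. Each context runs shutdown and reset back to back; if a timeout
fires while it is in the reset step, a further context handles it in the same
way, up to an assumed depth limit.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical toy model; none of these names exist in the nvme driver. */
#define MAX_EH_DEPTH 4	/* assumed cap on nesting, chosen arbitrarily */

struct toy_ctrl {
	int timeouts_left;	/* timeouts that will fire during later resets */
	int max_depth_seen;	/* deepest nesting level reached so far */
	int recoveries_run;	/* number of shutdown+reset passes started */
	int live;		/* 1 once the controller is usable again */
};

/*
 * One EH context: shutdown and reset both run in this single context.
 * Returns 0 on success, -1 if the nesting limit is exceeded.
 */
static int toy_eh_recover(struct toy_ctrl *c, int depth)
{
	assert(c != NULL);

	if (depth > MAX_EH_DEPTH)
		return -1;	/* too many nested timeouts, give up */

	if (depth > c->max_depth_seen)
		c->max_depth_seen = depth;
	c->recoveries_run++;

	/* Shutdown step: the controller is no longer usable. */
	c->live = 0;
	printf("eh[%d]: shutdown\n", depth);

	/* Reset step: a new timeout may fire while this context waits. */
	printf("eh[%d]: reset\n", depth);
	if (c->timeouts_left > 0) {
		c->timeouts_left--;
		printf("eh[%d]: timeout during reset, nesting\n", depth);
		/* Stand-in for starting a new EH context. */
		return toy_eh_recover(c, depth + 1);
	}

	/* No timeout during this reset: recovery is complete. */
	c->live = 1;
	printf("eh[%d]: controller live\n", depth);
	return 0;
}
```

With timeouts_left set to 2, calling toy_eh_recover(&c, 0) runs three
shutdown+reset passes and the innermost one brings the controller back. With
more pending timeouts than the depth limit allows, it returns -1 and leaves
the controller down.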

Thanks,
Ming Lei


