[PATCH V4 0/5] nvme-pci: fixes on nvme_timeout and nvme_dev_disable

Ming Lei ming.lei at redhat.com
Wed Apr 18 19:27:41 PDT 2018


On Thu, Apr 19, 2018 at 09:51:16AM +0800, jianchao.wang wrote:
> Hi Ming
> 
> Thanks for your kind response.
> 
> On 04/18/2018 11:40 PM, Ming Lei wrote:
> >> This patchset mainly fixes the circular dependency between
> >> nvme_timeout and nvme_dev_disable. As you can see,
> >> nvme_timeout invokes nvme_dev_disable, and nvme_dev_disable has to
> >> depend on nvme_timeout when the controller does not respond.
> > Do you mean nvme_disable_io_queues()? If yes, this one has been handled
> > by wait_for_completion_io_timeout() already, and it looks like the block
> > timeout can simply be disabled. Or are there others?
> > 
> Here is one possible scenario currently
> 
> nvme_dev_disable // hold shutdown_lock             nvme_timeout
>   -> nvme_set_host_mem                               -> nvme_dev_disable
>     -> nvme_submit_sync_cmd                            -> try to acquire shutdown_lock
>       -> __nvme_submit_sync_cmd
>         -> blk_execute_rq
>           //if sysctl_hung_task_timeout_secs == 0
>           -> wait_for_completion_io
> And nvme_dev_disable may need to issue other commands in the future.

OK, thanks for sharing this one. For now I think it might need to be
worked around by using wait_for_completion_io_timeout() in that path.

> 
> Even if we could fix these kinds of issues the way nvme_disable_io_queues
> does, it is still a risk, I think.

Yeah, I couldn't agree more. That is why I think the nvme timeout/eh code
should be refactored, so that the current issues are solved in a cleaner,
more maintainable way.

Thanks,
Ming


