[PATCH 2/2] nvmet: Fix fatal_err_work deadlock
James Smart
james.smart at broadcom.com
Sat Sep 30 12:33:26 PDT 2017
On 9/30/2017 11:09 AM, Sagi Grimberg wrote:
>> Below is a stack trace for an issue that was reported.
>>
>> What's happening is that the nvmet layer had its controller kato
>> timeout fire, which causes it to schedule its fatal error handler
>> via the fatal_err_work element. The error handler is invoked, which
>> calls the transport delete_ctrl() entry point, and as the transport
>> tears down the controller, nvmet_sq_destroy ends up doing the final
>> put on the ctlr, causing it to enter its free routine. The ctlr free
>> routine does a cancel_work_sync() on the fatal_err_work element, which
>> then does a flush_work and wait_for_completion. But, as the wait is
>> in the context of the work element being flushed, it's in a catch-22
>> and the thread hangs.
>
> The fatal error handler assumed that delete_ctrl execution is
> asynchronous, given that controller teardown is refcounted by queues
> that are refcounted by inflight IO. This suggests that the actual
> controller free is async by nature; we probably should have
> documented it...
>
> Does fc's delete_ctrl block until all inflight IO is drained? I would
> suggest deferring this blocking routine out of the fatal_error path,
> like rdma and loop do. Is that something that breaks your design?
No - it really doesn't block waiting (like the host side does), although
it may appear that way. The real difference is that it processes the
teardown in its entirety, and it's possible, especially under light/idle
load, that the ref counting causes the final put (and so the free) to
occur in the delete_ctrl context. Whereas rdma and loop definitely
convert over to another workq context for teardown. Yes, I can do that
too. And yes, if there are requirements like this for a transport,
please add comments/documentation. Although, as you can see by this
proposed patch, an implementation can be made in the core that places
no such requirement on a transport.
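
To make the two cases concrete, here is a rough userspace sketch of the
sequence described above. It is only a single-threaded toy model, not
nvmet code: every toy_* name is made up for illustration, and
toy_cancel_work_sync() merely reports that the caller is the work item
itself, where the real cancel_work_sync() would wait forever.

```c
#include <stdbool.h>

struct toy_ctrl;

/* Hypothetical stand-in for a work item; not the kernel's work_struct. */
struct toy_work {
	void (*fn)(struct toy_ctrl *c);
	bool queued;	/* scheduled but not yet run */
	bool running;	/* true while fn is executing */
};

struct toy_ctrl {
	int refs;
	bool defer_teardown;	/* true: teardown moves to delete_work */
	bool freed;
	bool self_wait;		/* free waited on the work it ran from */
	struct toy_work fatal_err_work;
	struct toy_work delete_work;
};

/*
 * Models cancel_work_sync(): returns false when the work being cancelled
 * is the one currently executing, i.e. the caller would wait on itself.
 */
static bool toy_cancel_work_sync(struct toy_work *w)
{
	w->queued = false;
	return !w->running;
}

/* Final put enters the free routine, which syncs fatal_err_work. */
static void toy_ctrl_put(struct toy_ctrl *c)
{
	if (--c->refs > 0)
		return;
	if (!toy_cancel_work_sync(&c->fatal_err_work))
		c->self_wait = true;	/* the real thread would hang here */
	c->freed = true;
}

/* Teardown running in its own work context (rdma/loop style). */
static void toy_delete_fn(struct toy_ctrl *c)
{
	toy_ctrl_put(c);
}

/* Fatal error handler: tear down inline, or hand off to delete_work. */
static void toy_fatal_err_fn(struct toy_ctrl *c)
{
	if (c->defer_teardown)
		c->delete_work.queued = true;
	else
		toy_ctrl_put(c);	/* final put inside fatal_err_work */
}

static void toy_run_work(struct toy_ctrl *c, struct toy_work *w)
{
	if (!w->queued)
		return;
	w->queued = false;
	w->running = true;
	w->fn(c);
	w->running = false;
}

/* Returns true if the free routine ended up waiting on its own work. */
static bool toy_simulate(bool defer_teardown)
{
	struct toy_ctrl c = { .refs = 1, .defer_teardown = defer_teardown };

	c.fatal_err_work.fn = toy_fatal_err_fn;
	c.delete_work.fn = toy_delete_fn;

	c.fatal_err_work.queued = true;	/* kato timeout fired */
	toy_run_work(&c, &c.fatal_err_work);
	toy_run_work(&c, &c.delete_work);
	return c.self_wait;
}
```

Calling toy_simulate(false) models the inline teardown and reports the
self-wait; toy_simulate(true) models handing the teardown to a separate
work item, where the final put happens after fatal_err_work has
returned and the sync has nothing to wait on.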
-- james