[PATCH v2] nvmet: Fix fatal_err_work deadlock

Christoph Hellwig hch at infradead.org
Mon Oct 23 07:50:13 PDT 2017


On Mon, Oct 23, 2017 at 02:05:08PM +0300, Sagi Grimberg wrote:
> Regardless of flush_work or cancel_work_sync, it's a deadlock
> in fc.
> 
> in rdma/loop we always call ->free_ctrl from a different context.
> 
> In rdma we do that from the rdma_cm context, in loop we schedule
> host side delete on nvme_wq, in fc apparently we can get to
> free_ctrl directly from that context.

Yes, nvmet_fc_delete_ctrl -> nvmet_fc_delete_target_assoc ->
nvmet_fc_delete_target_queue.

> If fatal_err_work calls ->delete_ctrl() and that in turn gets to put the
> last reference on the ctrl it will end up in ->free_ctrl() under
> fatal_err_work context which will then try to flush fatal_err_work.

Yes, and as I understand it, that is perfectly fine for flush_work,
just not for cancel_work_sync.
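
To make the call chain above concrete, here is a rough userspace sketch
of how the last reference can end up being dropped from inside the work
handler itself. Everything in it (the toy_* names, the struct, the
flags) is hypothetical and only models the flow described in this
thread; it is not the nvmet code and it does not call any workqueue
API. It only records whether the free path ran in the work context,
and takes no position on which wait primitive is safe there.

```c
#include <stdbool.h>

/* Toy model only: none of these are the real nvmet structures. */
struct toy_ctrl {
	int refcount;
	bool in_fatal_err_work;     /* true while the handler is running */
	bool freed;
	bool freed_in_work_context; /* free path ran inside the handler */
};

/* Stands in for ->free_ctrl(): the place that would wait on the work. */
static void toy_free_ctrl(struct toy_ctrl *ctrl)
{
	/* Record the problematic case instead of actually waiting. */
	if (ctrl->in_fatal_err_work)
		ctrl->freed_in_work_context = true;
	ctrl->freed = true;
}

/* Drop one reference; the last put runs the free path synchronously. */
static void toy_ctrl_put(struct toy_ctrl *ctrl)
{
	if (--ctrl->refcount == 0)
		toy_free_ctrl(ctrl);
}

/*
 * Stands in for ->delete_ctrl() in the fc case: the reference is
 * dropped directly in the caller's context, not handed off to a
 * different one as described for rdma and loop.
 */
static void toy_delete_ctrl(struct toy_ctrl *ctrl)
{
	toy_ctrl_put(ctrl);
}

/* Stands in for the fatal_err_work handler. */
static void toy_fatal_err_work(struct toy_ctrl *ctrl)
{
	ctrl->in_fatal_err_work = true;
	toy_delete_ctrl(ctrl);
	ctrl->in_fatal_err_work = false;
}

/* Returns true if the free path was reached from the work handler. */
static bool toy_run(int initial_refs)
{
	struct toy_ctrl ctrl = { .refcount = initial_refs };

	toy_fatal_err_work(&ctrl);
	return ctrl.freed_in_work_context;
}
```

With toy_run(1) the handler holds the last reference, so the free path
runs under the handler. With toy_run(2) someone else still holds a
reference and the free is deferred to that later put.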


