Deadlock on device removal event for NVMeF target

Robert LeBlanc robert at leblancnet.us
Thu Jun 29 06:30:17 PDT 2017


Could something like this be causing the D state problem I was seeing
in iSER almost a year ago? I tried writing a patch for iSER based on
this, but it didn't help. Either the bug is not being triggered in
device removal, or I didn't line up the statuses correctly. But it
seems that things are getting stuck in the work queue and some sort of
deadlock is happening so I was hopeful that something similar may be
in iSER.

Thanks,
Robert
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Jun 28, 2017 at 12:50 AM, Sagi Grimberg <sagi at grimberg.me> wrote:
>
>>> How about the (untested) alternative below:
>>> --
>>> [PATCH] nvmet-rdma: register ib_client to not deadlock in device
>>>    removal
>>>
>>> We can deadlock in case we got to a device removal
>>> event on a queue which is already in the process of
>>> destroying the cm_id, as this blocks until all
>>> events on this cm_id drain. On the other hand
>>> we cannot guarantee that rdma_destroy_id was invoked
>>> as we only have indication that the queue disconnect
>>> flow has been queued (the queue state is updated before
>>> the release work has been queued).
>>>
>>> So, we leave all the queue removal to a separate ib_client
>>> to avoid this deadlock as ib_client device removal is in
>>> a different context than the cm_id itself.
>>>
>>> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>>> ---
>>
>>
>> Yes. This patch fixes the problem I am seeing.
>
>
> Awesome,
>
> Adding your Tested-by tag.
>
> Thanks!
>


