target crash / host hang with nvme-all.3 branch of nvme-fabrics
Sagi Grimberg
sagi at grimberg.me
Thu Jun 16 14:37:18 PDT 2016
>> How do we rely on that? __nvmet_rdma_queue_disconnect callers are
>> responsible for the queue_list deletion and for queuing the release.
>> I don't see where we are getting it wrong.
>
> Thread 1:
>
> Moves the queues off nvmet_rdma_queue_list and onto the
> local list in nvmet_rdma_delete_ctrl
>
> Thread 2:
>
> Gets into nvmet_rdma_cm_handler -> nvmet_rdma_queue_disconnect for one
> of the queues now on the local list. list_empty(&queue->queue_list) evaluates
> to false because the queue is on the local list, and now we have thread 1
> and 2 racing for disconnecting the queue.
But the list removal and the list_empty evaluation are both still done
under a mutex. Isn't that sufficient to avoid the race?
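
For reference, the list_empty() behaviour described in the quoted scenario
can be reproduced with a small userspace sketch. This is an illustrative
model only, not the nvmet-rdma code: struct model_queue, the helper
functions, and disconnect_check_passes_after_move() are hypothetical names,
and the list helpers are simplified stand-ins for the kernel's list.h. It
shows only that a check of the form !list_empty(&queue->queue_list) still
passes once the entry has been moved to a local list. It says nothing about
whether holding the mutex around that check is enough.

```c
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* True when the node points only at itself, i.e. it is not linked anywhere. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Unlink the entry and reinitialise it so list_empty(entry) becomes true. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

static void list_move_tail(struct list_head *e, struct list_head *head)
{
	list_del_init(e);
	list_add_tail(e, head);
}

/* Hypothetical queue object; only the list linkage is modelled. */
struct model_queue {
	struct list_head queue_list;
};

/*
 * Models the two steps from the scenario:
 *   thread 1 moves the queue from the global list to a local list,
 *   thread 2 then evaluates !list_empty(&queue->queue_list).
 * Returns the result of thread 2's check.
 */
bool disconnect_check_passes_after_move(void)
{
	struct list_head global_list, local_list;
	struct model_queue q;

	init_list_head(&global_list);
	init_list_head(&local_list);
	init_list_head(&q.queue_list);

	list_add_tail(&q.queue_list, &global_list);

	/* Thread 1: take the queue off the global list, onto the local one. */
	list_move_tail(&q.queue_list, &local_list);

	/* Thread 2: the entry is still linked, so the check passes. */
	return !list_empty(&q.queue_list);
}

/* For contrast: after list_del_init() the same check fails. */
bool disconnect_check_passes_after_del_init(void)
{
	struct list_head global_list;
	struct model_queue q;

	init_list_head(&global_list);
	init_list_head(&q.queue_list);
	list_add_tail(&q.queue_list, &global_list);
	list_del_init(&q.queue_list);

	return !list_empty(&q.queue_list);
}
```

In this model the first function returns true and the second returns false:
the check only tells an entry that is linked somewhere apart from one that
was removed with list_del_init(), and it cannot tell which list the entry
is on.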