blktests failures with v6.4
Sagi Grimberg
sagi at grimberg.me
Thu Jul 13 00:48:12 PDT 2023
>>> #3: nvme/003 (fabrics transport)
>>>
>>> When nvme test group is run with trtype=rdma or tcp, the test case fails
>>> due to lockdep WARNING "possible circular locking dependency detected".
>>> Reported in May/2023. Bart suggested a fix for trtype=rdma [4] but it
>>> needs more discussion.
>>>
>>> [4] https://lore.kernel.org/linux-nvme/20230511150321.103172-1-bvanassche@acm.org/
>>
>> This patch is unfortunately incorrect and buggy.
>>
>> This will likely make the issue go away, but it reintroduces an
>> old issue where a client can DoS a target by bombarding it
>> with connect/disconnect. When releases are async and we don't
>> have any back-pressure, it is likely to happen.
>> --
>> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
>> index 4597bca43a6d..8b4f4aa48206 100644
>> --- a/drivers/nvme/target/rdma.c
>> +++ b/drivers/nvme/target/rdma.c
>> @@ -1582,11 +1582,6 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
>>  		goto put_device;
>>  	}
>>
>> -	if (queue->host_qid == 0) {
>> -		/* Let inflight controller teardown complete */
>> -		flush_workqueue(nvmet_wq);
>> -	}
>> -
>>  	ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
>>  	if (ret) {
>>  		/*
>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>> index 868aa4de2e4c..c8cfa19e11c7 100644
>> --- a/drivers/nvme/target/tcp.c
>> +++ b/drivers/nvme/target/tcp.c
>> @@ -1844,11 +1844,6 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
>>  	struct nvmet_tcp_queue *queue =
>>  		container_of(sq, struct nvmet_tcp_queue, nvme_sq);
>>
>> -	if (sq->qid == 0) {
>> -		/* Let inflight controller teardown complete */
>> -		flush_workqueue(nvmet_wq);
>> -	}
>> -
>>  	queue->nr_cmds = sq->size * 2;
>>  	if (nvmet_tcp_alloc_cmds(queue))
>>  		return NVME_SC_INTERNAL;
>> --
>
> Thanks Sagi, I tried the patch above and confirmed the lockdep WARN disappears
> for both rdma and tcp. It indicates that the flush_workqueue(nvmet_wq)
> introduced the circular lock dependency.

Thanks for confirming. This was expected.

> I also found the two commits below
> record why the flush_workqueue(nvmet_wq) was introduced.
>
> 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
> 8832cf922151 ("nvmet: use a private workqueue instead of the system workqueue")

The second patch is unrelated; before it, we used a global workqueue and
fundamentally had the same issue.

> The remaining question is how to avoid both the connect/disconnect bombarding
> DoS and the circular lock possibility related to the nvmet_wq completion.

I don't see any way to synchronize connects with releases without moving
connect sequences to a dedicated thread, which in my mind is undesirable.

The only solution I can think of is to fail a host connect, expecting the
host to reconnect, and throttle this way, but that would lead to spurious
connect failures (at least from the host PoV).

Maybe we can add a NOT_READY connect error code in nvme for that...
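
To make the idea concrete, here is a rough user-space sketch of the
throttling logic, not kernel code and not a proposed patch. It assumes a
hypothetical CONNECT_NOT_READY status (no such code exists in nvme today),
a hypothetical MAX_PENDING_RELEASES limit, and a made-up struct target that
merely counts async releases still in flight. The point is only that the
connect path never waits on the release work; it refuses and lets the host
retry.

```c
#include <assert.h>

/* Hypothetical status values; NOT_READY is not an existing nvme code. */
enum connect_status {
	CONNECT_OK = 0,
	CONNECT_NOT_READY = 1,
};

/* Hypothetical cap on controller teardowns still in flight. */
#define MAX_PENDING_RELEASES 4

/* Made-up bookkeeping standing in for the target's state. */
struct target {
	int pending_releases;	/* async releases queued but not finished */
	int live_ctrls;		/* controllers currently connected */
};

/*
 * Connect path: never blocks on release work. If too many releases are
 * outstanding, fail the connect so the host backs off and retries.
 */
static enum connect_status target_connect(struct target *t)
{
	assert(t->pending_releases >= 0);
	if (t->pending_releases >= MAX_PENDING_RELEASES)
		return CONNECT_NOT_READY;
	t->live_ctrls++;
	return CONNECT_OK;
}

/* Disconnect path: teardown is queued asynchronously. */
static void target_disconnect(struct target *t)
{
	if (t->live_ctrls > 0) {
		t->live_ctrls--;
		t->pending_releases++;
	}
}

/* Called when one queued release has actually completed. */
static void target_release_done(struct target *t)
{
	if (t->pending_releases > 0)
		t->pending_releases--;
}
```

A host that bombards the target with connect/disconnect starts seeing
CONNECT_NOT_READY once the limit of outstanding releases is reached, and
connects succeed again as releases complete. The cost is the one noted
above: a well-behaved host can see spurious connect failures.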