nvmet-tcp: dead lock on host disconnect
Grupi, Elad
Elad.Grupi at dell.com
Wed Oct 20 08:41:20 PDT 2021
Hi
We are facing an issue that leads to a deadlock in the nvmet-tcp layer.
Our host closes the TCP connection immediately after sending the nvme-connect command, without waiting for its response.
That leaves nvmet_tcp_install_queue blocked in flush_scheduled_work, waiting for the in-flight controller teardown to complete,
while nvmet_tcp_release_queue_work is blocked waiting for sq_destroy to finish, which never happens because the connect command is stuck.
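To make the circular wait easier to see, here is a toy model of it as a wait-for graph. This is only a sketch and not kernel code: the node names are taken from the description above, the edges are my reading of that description (in particular, that the connect command cannot complete until nvmet_tcp_install_queue returns), and find_cycle is a hypothetical helper written for this illustration.

```python
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the first wait cycle found, or None if there is none.

    Each key waits for exactly one other party (its value).
    """
    for start in waits_for:
        path = [start]
        node = start
        # Follow the chain of waits until it ends or revisits a node.
        while node in waits_for:
            node = waits_for[node]
            if node in path:
                return path[path.index(node):]
            path.append(node)
    return None


# Assumed wait-for edges, based on the report above (not taken from the code).
waits_for = {
    "nvmet_tcp_install_queue": "flush_scheduled_work",
    "flush_scheduled_work": "nvmet_tcp_release_queue_work",
    "nvmet_tcp_release_queue_work": "sq_destroy",
    "sq_destroy": "connect command",
    "connect command": "nvmet_tcp_install_queue",
}

cycle = find_cycle(waits_for)
```

Under these assumptions every party in the chain is waiting on the next one, and the last one is waiting on the first, so none of them can make progress.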
I noticed that the same issue was reported in 2018 in the RDMA layer, but I don't see any patch that fixed it.
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1795846.html
Please let me know if you have more information about this deadlock issue.
Thanks,
Elad