[PATCH V3 1/1] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state

Maurizio Lombardi mlombard at redhat.com
Thu Mar 6 08:03:22 PST 2025


A race condition can occur if the target closes the socket
while the host controller is still in the CONNECTING state.

If the socket's state changes to TCP_CLOSE, nvme_tcp_state_change()
is invoked and calls nvme_tcp_error_recovery(). However, the latter
cannot move the controller to NVME_CTRL_RESETTING because the
controller is still in the CONNECTING state. As a result, error
recovery is skipped and the controller incorrectly transitions
to the LIVE state with closed sockets.

Subsequent attempts by the host to communicate with the target
will result in an infinite loop.

Fix the bug by starting error recovery when nvme_tcp_try_send()
detects an error, so that the disconnection is handled correctly
even if the socket close event was missed while transitioning
from CONNECTING to LIVE.

Signed-off-by: Maurizio Lombardi <mlombard at redhat.com>
---

v3: Do not check for -EPIPE, just call nvme_tcp_error_recovery()

v2: commit message: clarify where the error recovery is started
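
Illustration only, not part of the patch: the race described in the
commit message can be modeled in a few lines of userspace C. This is a
rough sketch under stated assumptions. Every model_* and MODEL_* name is
hypothetical, and the transition table is a simplified guess beyond the
two facts the commit message gives (RESETTING is not reachable from
CONNECTING, and CONNECTING can move to LIVE). It is not the kernel's
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the controller states named above. */
enum model_state { MODEL_NEW, MODEL_LIVE, MODEL_RESETTING, MODEL_CONNECTING };

struct model_ctrl {
	enum model_state state;
	bool socket_closed;	/* the target has closed the connection */
	bool err_work_queued;	/* error recovery has actually been started */
};

/*
 * Assumed transition rules. The key one, taken from the commit message,
 * is that RESETTING cannot be entered from CONNECTING.
 */
static bool model_change_state(struct model_ctrl *ctrl, enum model_state new_state)
{
	bool ok = false;

	assert(ctrl != NULL);
	switch (new_state) {
	case MODEL_LIVE:
		ok = ctrl->state == MODEL_NEW ||
		     ctrl->state == MODEL_RESETTING ||
		     ctrl->state == MODEL_CONNECTING;
		break;
	case MODEL_RESETTING:
		ok = ctrl->state == MODEL_NEW || ctrl->state == MODEL_LIVE;
		break;
	case MODEL_CONNECTING:
		ok = ctrl->state == MODEL_NEW || ctrl->state == MODEL_RESETTING;
		break;
	default:
		break;
	}
	if (ok)
		ctrl->state = new_state;
	return ok;
}

/* Recovery starts only if the move to RESETTING succeeds. */
static void model_error_recovery(struct model_ctrl *ctrl)
{
	if (!model_change_state(ctrl, MODEL_RESETTING))
		return;	/* request is silently dropped */
	ctrl->err_work_queued = true;
}

/* Models the socket state-change callback observing a close. */
static void model_socket_closed(struct model_ctrl *ctrl)
{
	ctrl->socket_closed = true;
	model_error_recovery(ctrl);
}

/* Models the send path; 'patched' toggles the one-line fix. */
static int model_try_send(struct model_ctrl *ctrl, bool patched)
{
	if (!ctrl->socket_closed)
		return 0;
	if (patched)
		model_error_recovery(ctrl);
	return -1;	/* the send failed */
}
```

In this model, calling model_socket_closed() on a controller in
MODEL_CONNECTING leaves err_work_queued false. Once the controller has
moved to MODEL_LIVE, only model_try_send() with patched set to true
queues recovery, which mirrors the intent of the change below.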

 drivers/nvme/host/tcp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8a9131c95a3d..4e2cbad3f2bc 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1304,6 +1304,7 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
 			"failed to send request %d\n", ret);
 		nvme_tcp_fail_request(queue->request);
 		nvme_tcp_done_send_req(queue);
+		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 	}
 out:
 	memalloc_noreclaim_restore(noreclaim_flag);
-- 
2.43.5