[PATCH V4 1/2] nvme-tcp: Prevent infinite loop if socket closes during CONNECTING state
Sagi Grimberg
sagi at grimberg.me
Sun Apr 13 15:44:19 PDT 2025
On 04/04/2025 11:28, Maurizio Lombardi wrote:
> There is a potential race condition that can occur if
> the target closes the socket while the host is in the CONNECTING state.
>
> If the socket's state changes to TCP_CLOSE, the nvme_tcp_state_change()
> function is invoked. However, nvme_tcp_error_recovery() is unable
> to transition the controller state to NVME_CTRL_RESETTING because
> the controller is still in the CONNECTING state. As a result, error
> recovery is bypassed, and the controller incorrectly transitions
> to the LIVE state with closed sockets.
I think that the issue is that the controller moves to LIVE state - it
shouldn't.
However, it's not clear where this happens.
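
To make the sequence in the commit message concrete, here is a minimal standalone sketch of the state check being described. It is not the kernel code: the enum, change_state(), error_recovery() and close_during_connect_ends_live() are hypothetical names, and the transition table is an assumption reduced to the few states discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of controller states (not the kernel enum). */
enum ctrl_state { CTRL_NEW, CTRL_LIVE, CTRL_RESETTING, CTRL_CONNECTING };

/*
 * Assumed transition rules for this sketch:
 *   LIVE       <- NEW, RESETTING, CONNECTING
 *   RESETTING  <- NEW, LIVE
 *   CONNECTING <- NEW, RESETTING
 * Returns true and updates *cur only if the transition is allowed.
 */
static bool change_state(enum ctrl_state *cur, enum ctrl_state next)
{
	bool ok = false;

	switch (next) {
	case CTRL_LIVE:
		ok = (*cur == CTRL_NEW || *cur == CTRL_RESETTING ||
		      *cur == CTRL_CONNECTING);
		break;
	case CTRL_RESETTING:
		ok = (*cur == CTRL_NEW || *cur == CTRL_LIVE);
		break;
	case CTRL_CONNECTING:
		ok = (*cur == CTRL_NEW || *cur == CTRL_RESETTING);
		break;
	default:
		break;
	}
	if (ok)
		*cur = next;
	return ok;
}

/* Models error recovery: it does nothing unless RESETTING can be entered. */
static bool error_recovery(enum ctrl_state *cur)
{
	return change_state(cur, CTRL_RESETTING);
}

/*
 * Replays the reported sequence: the socket closes while CONNECTING,
 * recovery is skipped, and the connect path then moves on to LIVE.
 * Returns true if the controller ends up LIVE without any recovery.
 */
static bool close_during_connect_ends_live(void)
{
	enum ctrl_state s = CTRL_NEW;

	change_state(&s, CTRL_CONNECTING);
	bool recovered = error_recovery(&s);	/* TCP_CLOSE seen here */
	bool live = change_state(&s, CTRL_LIVE);

	return !recovered && live && s == CTRL_LIVE;
}
```

Under these assumed rules the socket-close event is dropped, because RESETTING cannot be entered from CONNECTING, and nothing in the model stops the later move to LIVE.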
>
> Subsequent attempts by the host to communicate with the target
> will result in an infinite loop.
>
> Fix the bug by initiating the error recovery process to correctly
> handle the disconnection in case we missed this event
> while transitioning from CONNECTING to LIVE.
The problem is in the initial connect - here there is no error recovery
and we want to propagate the error to the user.