[PATCH] nvmet-tcp: fix race between ICReq handling and queue teardown
Keith Busch
kbusch at kernel.org
Wed Apr 8 12:22:19 PDT 2026
On Wed, Apr 08, 2026 at 12:51:31AM -0700, Chaitanya Kulkarni wrote:
> nvmet_tcp_handle_icreq() updates queue->state after sending an
> Initialization Connection Response (ICResp), but it does so without
> serializing against target-side queue teardown.
>
> If an NVMe/TCP host sends an Initialization Connection Request
> (ICReq) and immediately closes the connection, target-side teardown
> may start in softirq context before io_work drains the already
> buffered ICReq. In that case, nvmet_tcp_schedule_release_queue()
> sets queue->state to NVMET_TCP_Q_DISCONNECTING and drops the queue
> reference under state_lock.
>
> If io_work later processes that ICReq, nvmet_tcp_handle_icreq() can
> still overwrite the state back to NVMET_TCP_Q_LIVE. That defeats the
> DISCONNECTING-state guard in nvmet_tcp_schedule_release_queue() and
> allows a later socket state change to re-enter teardown and issue a
> second kref_put() on an already released queue.
>
> The ICResp send failure path has the same problem. If teardown has
> already moved the queue to DISCONNECTING, a send error can still
> overwrite the state with NVMET_TCP_Q_FAILED, again reopening the
> window for a second teardown path to drop the queue reference.
>
> Fix this by serializing both post-send state transitions with
> state_lock and bailing out if teardown has already started.
This looks okay to me. Will give this a couple days then queue it up if
no issues reported.
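
For readers following along, the serialization described in the quoted
commit message can be sketched as a small userspace model. This is not the
patch itself, which is not quoted in this reply. The struct, the enum
values, and the model_* function names are hypothetical stand-ins, a
pthread mutex stands in for state_lock, and a plain integer stands in for
the kref.

```c
/* Userspace model only: names echo the commit message, not the kernel source. */
#include <assert.h>
#include <pthread.h>

enum q_state { Q_CONNECTING, Q_LIVE, Q_DISCONNECTING, Q_FAILED };

struct model_queue {
	pthread_mutex_t state_lock;	/* stands in for queue->state_lock */
	enum q_state state;		/* stands in for queue->state */
	int refs;			/* stands in for the queue kref */
};

static void model_queue_init(struct model_queue *q)
{
	pthread_mutex_init(&q->state_lock, NULL);
	q->state = Q_CONNECTING;
	q->refs = 1;
}

/* Teardown entry point: only the first caller drops the reference. */
static void model_schedule_release(struct model_queue *q)
{
	pthread_mutex_lock(&q->state_lock);
	if (q->state != Q_DISCONNECTING) {
		q->state = Q_DISCONNECTING;
		assert(q->refs > 0);	/* a second put would trip this */
		q->refs--;
	}
	pthread_mutex_unlock(&q->state_lock);
}

/*
 * Post-send state transition. Returns 0 if the state was updated,
 * -1 if teardown had already started and the state was left alone.
 */
static int model_icreq_post_send(struct model_queue *q, int send_ok)
{
	int ret = 0;

	pthread_mutex_lock(&q->state_lock);
	if (q->state == Q_DISCONNECTING)
		ret = -1;
	else
		q->state = send_ok ? Q_LIVE : Q_FAILED;
	pthread_mutex_unlock(&q->state_lock);
	return ret;
}
```

In this model, once model_schedule_release() has run, a late
model_icreq_post_send() leaves the state at Q_DISCONNECTING, so a repeated
model_schedule_release() remains a no-op and the reference is dropped
exactly once. Without the check under the lock, the late call would reset
the state to Q_LIVE or Q_FAILED and the next release would drop the
reference a second time.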