[PATCH] nvmet-tcp: fix potential race of tcp socket closing accept_work

Fri Feb 12 17:22:14 EST 2021

> Sagi,
> 
> I think there is another race in that solution.
> 
> Please consider following flow:
> 
> Thread 1 taking sk_callback_lock and goes into the else block (state is TCP_ESTABLISHED), going to line:
> 	sock->sk->sk_user_data = queue;
> 
> at that point, thread 2 invokes sk_state_change, but this is still the original state_change callback because we didn't set yet to nvmet_tcp_state_change.
> 
> Thread 1 continues and set the callback to nvmet_tcp_state_change, but the sk_state_change was already called and nvmet_tcp_state_change will be never called.
> 
> In that case, there is a leak, and the queue will never be removed.

Are you saying that this is needed?
--

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index ac2d9ed23cea..d82df6cca801 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1486,6 +1486,7 @@ static int nvmet_tcp_set_queue_sock(struct 
nvmet_tcp_queue *queue)

         ret = 0;
         write_lock_bh(&sock->sk->sk_callback_lock);
+       lock_sock(sock->sk);
         if (sock->sk->sk_state != TCP_ESTABLISHED) {
                 /*
                  * If the socket is already closing, don't even start
@@ -1502,6 +1503,7 @@ static int nvmet_tcp_set_queue_sock(struct 
nvmet_tcp_queue *queue)
                 sock->sk->sk_write_space = nvmet_tcp_write_space;
                 queue_work_on(queue_cpu(queue), nvmet_tcp_wq, 
&queue->io_work);
         }
+       release_sock(sock->sk);
         write_unlock_bh(&sock->sk->sk_callback_lock);

         return ret;
--