[PATCH V3 1/3] nvmet-tcp: fix a race condition between release_queue and io_work

John Meneghini jmeneghi at redhat.com
Tue Nov 16 09:25:30 PST 2021


Reviewed-by: John Meneghini <jmeneghi at redhat.com>

On 11/16/21 10:49, Maurizio Lombardi wrote:
> If the initiator executes a reset controller operation while
> performing I/O, the target kernel will crash because of a race condition
> between release_queue and io_work:
> nvmet_tcp_uninit_data_in_cmds() may be executed while io_work
> is running. Calling flush_work() was not sufficient to
> prevent this because io_work could requeue itself.
> 
> Fix this bug by using cancel_work_sync() to prevent io_work
> from requeuing itself and by setting rcv_state to NVMET_TCP_RECV_ERR
> to make sure we don't receive any more data from the socket.
> 
> Signed-off-by: Maurizio Lombardi <mlombard at redhat.com>
> Reviewed-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>   drivers/nvme/target/tcp.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 84c387e4bf43..18f36256095f 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1437,7 +1437,9 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
>   	mutex_unlock(&nvmet_tcp_queue_mutex);
>   
>   	nvmet_tcp_restore_socket_callbacks(queue);
> -	flush_work(&queue->io_work);
> +	cancel_work_sync(&queue->io_work);
> +	/* stop accepting incoming data */
> +	queue->rcv_state = NVMET_TCP_RECV_ERR;
>   
>   	nvmet_tcp_uninit_data_in_cmds(queue);
>   	nvmet_sq_destroy(&queue->nvme_sq);
> 



