[PATCH 2/2] nvmet: fix a race condition between release_queue and io_work

Maurizio Lombardi mlombard at redhat.com
Sun Nov 14 23:47:20 PST 2021


Hello Sagi,

On Sun, Nov 14, 2021 at 12:28:45PM +0200, Sagi Grimberg wrote:
> 
> 
> I think you should shutdown the socket after you restore the socket
> callbacks, and after that cancel_work_sync to deny a self-requeue.
> 
> > -	flush_work(&queue->io_work);
> > +	while (flush_work(&queue->io_work));
> >   	nvmet_tcp_uninit_data_in_cmds(queue);
> >   	nvmet_sq_destroy(&queue->nvme_sq);
> > 
> 
> How about something like this:
> --
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 6eb0b3153477..6425375faf5b 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1436,7 +1436,9 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
>         mutex_unlock(&nvmet_tcp_queue_mutex);
> 
>         nvmet_tcp_restore_socket_callbacks(queue);
> -       flush_work(&queue->io_work);
> +       /* stop accepting incoming data */
> +       kernel_sock_shutdown(queue->sock, SHUT_RD);
> +       cancel_work_sync(&queue->io_work);
> 
>         nvmet_tcp_uninit_data_in_cmds(queue);
>         nvmet_sq_destroy(&queue->nvme_sq);
> --
> 

This one doesn't work either: nvmet_sq_destroy() will start io_work again
(completing the outstanding requests re-queues the work), and, as I wrote in
my previous email, recvmsg() may still return data after the socket has been
shut down.
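
To make the ordering explicit, here is a rough sketch of what I believe
happens with the change above. This is only an annotated reading of
drivers/nvme/target/tcp.c, not a patch, and the exact way the responses
re-queue io_work is my interpretation of the code:

--
	/* nvmet_tcp_release_queue_work(), with the proposed change */
	nvmet_tcp_restore_socket_callbacks(queue);
	/* stop accepting incoming data */
	kernel_sock_shutdown(queue->sock, SHUT_RD);
	cancel_work_sync(&queue->io_work);	/* io_work is idle here */

	nvmet_tcp_uninit_data_in_cmds(queue);

	/*
	 * nvmet_sq_destroy() completes the outstanding requests; their
	 * responses go through nvmet_tcp_queue_response(), which
	 * re-queues io_work on nvmet_tcp_wq, so io_work runs again.
	 */
	nvmet_sq_destroy(&queue->nvme_sq);

	/*
	 * io_work now calls nvmet_tcp_try_recv(); recvmsg() can still hand
	 * back data that was sitting in the socket buffer before the
	 * shutdown, and it lands on cmds we just uninitialized -- hence
	 * the memcpy_erms fault below.
	 */
--

So either io_work has to become a no-op once the teardown has started
(for example by moving the receive state machine into an error state), or
the flush has to be repeated after nvmet_sq_destroy().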

[ 1034.234574] nvmet: creating controller 2 for subsystem test-nqn for NQN nqn.2014-08.org.nvmexpress:uuid:a5f7dc87-f3f0-ea44-bfd1-0019ebbb4112. 
[ 1034.764498] nvmet: creating controller 1 for subsystem test-nqn for NQN nqn.2014-08.org.nvmexpress:uuid:a5f7dc87-f3f0-ea44-bfd1-0019ebbb4112. 
[ 1035.291398] nvmet: creating controller 2 for subsystem test-nqn for NQN nqn.2014-08.org.nvmexpress:uuid:a5f7dc87-f3f0-ea44-bfd1-0019ebbb4112. 
[ 1038.471686] nvmet: creating controller 1 for subsystem test-nqn for NQN nqn.2014-08.org.nvmexpress:uuid:a5f7dc87-f3f0-ea44-bfd1-0019ebbb4112. 
[ 1039.002853] general protection fault, probably for non-canonical address 0x51ede3866dd88e22: 0000 [#1] SMP PTI 
[ 1039.006622] CPU: 1 PID: 4869 Comm: kworker/1:2H Kdump: loaded Tainted: G        W        --------- ---  5.14.0_nvmet1+ #1 
[ 1039.008778] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 
[ 1039.009840] Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp] 
[ 1039.011040] RIP: 0010:memcpy_erms+0x6/0x10 
[ 1039.011841] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe 
[ 1039.015216] RSP: 0018:ffffc02c4059fb30 EFLAGS: 00010206 
[ 1039.016176] RAX: 51ede3866dd88e22 RBX: 000000000000248e RCX: 0000000000001000 
[ 1039.017472] RDX: 0000000000001000 RSI: ffff99d7c397948e RDI: 51ede3866dd88e22 
[ 1039.018767] RBP: ffff99d7cc02b000 R08: 0000000000000001 R09: 000000000000fe88 
[ 1039.020074] R10: 000000000000012a R11: 0000000000003ed6 R12: 0000000000000000 
[ 1039.021371] R13: 0000000000001000 R14: ffff99d7cc02b010 R15: 0000000000001a48 
[ 1039.022668] FS:  0000000000000000(0000) GS:ffff99d8f7d00000(0000) knlGS:0000000000000000 
[ 1039.024134] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[ 1039.025181] CR2: 00007ffbf45c1658 CR3: 00000001051e4003 CR4: 00000000003706e0 
[ 1039.026497] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 
[ 1039.027836] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 
[ 1039.029185] Call Trace: 
[ 1039.029696]  _copy_to_iter+0x44d/0x710 
[ 1039.030483]  ? __virt_addr_valid+0x45/0x70 
[ 1039.031284]  ? __check_object_size.part.0+0x45/0x140 
[ 1039.032200]  __skb_datagram_iter+0x19b/0x2d0 
[ 1039.032990]  ? zerocopy_sg_from_iter+0x50/0x50 
[ 1039.033788]  skb_copy_datagram_iter+0x33/0x80 
[ 1039.034568]  tcp_recvmsg_locked+0x299/0x960 
[ 1039.035338]  tcp_recvmsg+0x72/0x1d0 
[ 1039.036023]  ? bpf_lsm_socket_sendmsg+0x10/0x10 
[ 1039.036929]  inet_recvmsg+0x57/0xf0 
[ 1039.037624]  nvmet_tcp_try_recv.constprop.0+0xcb/0x310 [nvmet_tcp] 
[ 1039.038913]  ? pick_next_task_fair+0x39/0x3b0 
[ 1039.039771]  nvmet_tcp_io_work+0x44/0xabd [nvmet_tcp] 
[ 1039.040684]  ? dequeue_task_fair+0xb1/0x370 
[ 1039.041430]  ? finish_task_switch.isra.0+0xb1/0x290 
[ 1039.042305]  process_one_work+0x1e6/0x380 
[ 1039.043061]  worker_thread+0x53/0x3d0 
[ 1039.043720]  ? process_one_work+0x380/0x380 
[ 1039.044467]  kthread+0x10f/0x130 
[ 1039.045059]  ? set_kthread_struct+0x40/0x40 
[ 1039.045806]  ret_from_fork+0x22/0x30 


Maurizio
