[bug report] workqueue: WQ_MEM_RECLAIM nvmet-wq:nvmet_rdma_release_queue_work [nvmet_rdma] is flushing !WQ_MEM_RECLAIM events:nvmet_keep_alive_timer [nvmet]

Sagi Grimberg sagi at grimberg.me
Tue Sep 27 23:34:06 PDT 2022



On 9/23/22 10:34, Yi Zhang wrote:
> Hello
> 
> The issue below was triggered by blktests nvmeof-mp/012 on 5.19.10.
> Please help check it, and feel free to let me know if you need any
> further info or testing.
> 
> 
> It can be reproduced within 100 cycles of running "./check nvmeof-mp/012".
> dmesg log:
> [  350.579057] run blktests nvmeof-mp/012 at 2022-09-23 03:07:21
> [  350.660517] null_blk: module loaded
> [  350.668386] null_blk: disk nullb0 created
> [  350.670913] null_blk: disk nullb1 created
> [  350.691777] device-mapper: table: 253:0: multipath: error getting
> device (-EBUSY)
> [  350.691799] device-mapper: ioctl: error adding target to table
> [  350.746159] rdma_rxe: loaded
> [  350.753345] infiniband env3_rxe: set active
> [  350.753356] infiniband env3_rxe: added env3
> [  350.892769] nvmet: adding nsid 1 to subsystem nvme-test
> [  350.901787] nvmet_rdma: enabling port 1 (10.0.2.222:7777)
> [  350.940242] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 0000000017d82f43
> [  350.940886] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 0000000017d82f43
> [  350.940983] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000286b8895
> [  350.940989] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.941605] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 289
> cm_id= 00000000286b8895
> [  350.941659] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 0000000017d82f43
> [  350.941713] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000286b8895
> [  350.941778] nvmet:nvmet_start_keep_alive_timer: nvmet: ctrl 1 start
> keep-alive timer for 5 secs
> [  350.941781] nvmet: creating nvm controller 1 for subsystem
> nvme-test for NQN
> nqn.2014-08.org.nvmexpress:uuid:4721a245-d23f-4c00-b1d0-8d273c3db004.
> [  350.942139] nvme nvme0: creating 8 I/O queues.
> [  350.942158] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 00000000c5dbee70
> [  350.942354] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 00000000c5dbee70
> [  350.942412] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 000000000c220555
> [  350.942416] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.942636] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 000000000c220555
> [  350.942697] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 00000000c5dbee70
> [  350.942740] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 000000004eb27b13
> [  350.942747] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 000000000c220555
> [  350.942935] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 000000004eb27b13
> [  350.942992] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000e169966e
> [  350.942995] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.943213] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 00000000e169966e
> [  350.943259] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 000000004eb27b13
> [  350.943302] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 0000000025dab642
> [  350.943308] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000e169966e
> [  350.943443] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 0000000025dab642
> [  350.943484] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000a5657f4f
> [  350.943487] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.943677] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 00000000a5657f4f
> [  350.943715] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 0000000025dab642
> [  350.943746] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 00000000eb92556a
> [  350.943748] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000a5657f4f
> [  350.943880] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 00000000eb92556a
> [  350.943916] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 0000000026cb2a8f
> [  350.943919] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.944127] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 0000000026cb2a8f
> [  350.944170] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 00000000eb92556a
> [  350.944199] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 0000000046325928
> [  350.944202] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 0000000026cb2a8f
> [  350.944336] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 0000000046325928
> [  350.944376] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000b31e4e71
> [  350.944378] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.944568] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 00000000b31e4e71
> [  350.944604] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 0000000046325928
> [  350.944634] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 00000000e1a31e4f
> [  350.944636] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000b31e4e71
> [  350.944769] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 00000000e1a31e4f
> [  350.944804] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000c3020c44
> [  350.944807] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.945029] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 00000000c3020c44
> [  350.945078] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 00000000e1a31e4f
> [  350.945115] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 000000006accc65c
> [  350.945121] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000c3020c44
> [  350.945267] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 000000006accc65c
> [  350.945312] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000ca3d146d
> [  350.945314] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.945510] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 00000000ca3d146d
> [  350.945555] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 000000006accc65c
> [  350.945591] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 00000000b0b8d1eb
> [  350.945598] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000ca3d146d
> [  350.945735] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 00000000b0b8d1eb
> [  350.945781] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 00000000969b9167
> [  350.945783] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  350.945980] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 1153
> cm_id= 00000000969b9167
> [  350.946025] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
> (9): status 0 id 00000000b0b8d1eb
> [  350.946051] nvme nvme0: mapped 8/0/0 default/read/poll queues.
> [  350.946073] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> established (9): status 0 id 00000000969b9167
> [  350.946285] nvmet:nvmet_execute_io_connect: nvmet: adding queue 1 to ctrl 1.
> [  350.946337] nvmet:nvmet_execute_io_connect: nvmet: adding queue 2 to ctrl 1.
> [  350.946378] nvmet:nvmet_execute_io_connect: nvmet: adding queue 3 to ctrl 1.
> [  350.946427] nvmet:nvmet_execute_io_connect: nvmet: adding queue 4 to ctrl 1.
> [  350.946479] nvmet:nvmet_execute_io_connect: nvmet: adding queue 5 to ctrl 1.
> [  350.946540] nvmet:nvmet_execute_io_connect: nvmet: adding queue 6 to ctrl 1.
> [  350.946595] nvmet:nvmet_execute_io_connect: nvmet: adding queue 7 to ctrl 1.
> [  350.946650] nvmet:nvmet_execute_io_connect: nvmet: adding queue 8 to ctrl 1.
> [  350.946708] nvme nvme0: new ctrl: NQN "nvme-test", addr 10.0.2.222:7777
> [  350.946739] nvmet:nvmet_req_cns_error_complete: nvmet: unhandled
> identify cns 6 on qid 0
> [  350.960438] device-mapper: table: 253:0: multipath: error getting
> device (-EBUSY)
> [  350.960452] device-mapper: ioctl: error adding target to table
> [  351.074392] device-mapper: table: 253:1: multipath: error getting
> device (-EBUSY)
> [  351.074405] device-mapper: ioctl: error adding target to table
> [  355.638030] nvme nvme0: Removing ctrl: NQN "nvme-test"
> [  355.825297] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 000000000c220555
> [  355.825306] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 000000000c220555 queue->state= 1
> [  355.825316] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 1
> [  355.825360] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000e169966e
> [  355.825365] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000e169966e queue->state= 1
> [  355.825375] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 2
> [  355.825387] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 00000000c5dbee70
> [  355.825392] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825419] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000a5657f4f
> [  355.825422] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000a5657f4f queue->state= 1
> [  355.825430] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 3
> [  355.825438] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 000000004eb27b13
> [  355.825443] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825504] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 0000000025dab642
> [  355.825509] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825513] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 0000000026cb2a8f
> [  355.825517] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 0000000026cb2a8f queue->state= 1
> [  355.825525] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 4
> [  355.825556] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000b31e4e71
> [  355.825560] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000b31e4e71 queue->state= 1
> [  355.825569] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 5
> [  355.825577] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 00000000eb92556a
> [  355.825582] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825640] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 0000000046325928
> [  355.825645] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825648] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000c3020c44
> [  355.825651] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000c3020c44 queue->state= 1
> [  355.825657] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 6
> [  355.825702] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000ca3d146d
> [  355.825705] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000ca3d146d queue->state= 1
> [  355.825712] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 7
> [  355.825719] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 00000000e1a31e4f
> [  355.825721] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825775] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 000000006accc65c
> [  355.825779] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.825782] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000969b9167
> [  355.825785] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000969b9167 queue->state= 1
> [  355.825791] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 8
> [  355.825839] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
> disconnected (10): status 0 id 00000000b0b8d1eb
> [  355.825844] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
> received - connection closed
> [  355.945236] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
> disconnected (10): status 0 id 00000000286b8895
> [  355.945244] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
> cm_id= 00000000286b8895 queue->state= 1
> [  355.945253] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 0
> [  355.968634] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
> resolved (0): status 0 id 000000003d55212b
> [  355.968712] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
> resolved  (2): status 0 id 000000003d55212b
> [  355.968794] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
> request (4): status 0 id 000000008fee8ffd
> [  355.968799] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
> added env3_rxe.
> [  355.968887] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
> nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 289
> cm_id= 000000008fee8ffd
> [  356.005186] nvmet:nvmet_stop_keep_alive_timer: nvmet: ctrl 1 stop keep-alive
> [  356.005186] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
> traffic based keep-alive timer
> [  356.005192] ------------[ cut here ]------------
> [  356.005204] workqueue: WQ_MEM_RECLAIM
> nvmet-wq:nvmet_rdma_release_queue_work [nvmet_rdma] is flushing
> !WQ_MEM_RECLAIM events:nvmet_keep_alive_timer [nvmet]

There was a spot missed in the patch converting nvmet to nvmet_wq:
nvmet_start_ctrl() still arms the keep-alive work on system_wq, so the
WQ_MEM_RECLAIM nvmet-wq ends up flushing a !WQ_MEM_RECLAIM work item
during queue teardown:
--
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 7f4083cf953a..428317fe6da0 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -1175,7 +1175,7 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl)
          * reset the keep alive timer when the controller is enabled.
          */
         if (ctrl->kato)
-               mod_delayed_work(system_wq, &ctrl->ka_work, ctrl->kato * HZ);
+               mod_delayed_work(nvmet_wq, &ctrl->ka_work, ctrl->kato * HZ);
  }

  static void nvmet_clear_ctrl(struct nvmet_ctrl *ctrl)
--
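
For context: the workqueue core warns here (check_flush_dependency())
because a work item running on a WQ_MEM_RECLAIM workqueue is waiting on
work queued on a !WQ_MEM_RECLAIM one. Only reclaim workqueues have a
rescuer thread, so under memory pressure that wait can deadlock. Below
is a minimal, hypothetical module sketch of the offending pattern (all
names are illustrative, not taken from nvmet):
--
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *reclaim_wq;	/* stands in for nvmet-wq */
static struct delayed_work ka_work;		/* stands in for ctrl->ka_work */
static struct work_struct release_work;		/* stands in for release_queue_work */

static void ka_fn(struct work_struct *w) { }

static void release_fn(struct work_struct *w)
{
	/*
	 * ka_work sits on system_wq ("events"), which has no rescuer.
	 * Waiting on it from a WQ_MEM_RECLAIM worker is what trips the
	 * WARN (it fires when the cancel actually has to wait for a
	 * running instance).
	 */
	cancel_delayed_work_sync(&ka_work);
}

static int __init demo_init(void)
{
	reclaim_wq = alloc_workqueue("demo-reclaim-wq", WQ_MEM_RECLAIM, 0);
	if (!reclaim_wq)
		return -ENOMEM;

	INIT_DELAYED_WORK(&ka_work, ka_fn);
	INIT_WORK(&release_work, release_fn);

	/*
	 * The bug: the delayed work is armed on system_wq instead of
	 * the driver's own WQ_MEM_RECLAIM workqueue.
	 */
	mod_delayed_work(system_wq, &ka_work, 5 * HZ);

	queue_work(reclaim_wq, &release_work);
	return 0;
}

static void __exit demo_exit(void)
{
	cancel_work_sync(&release_work);
	cancel_delayed_work_sync(&ka_work);
	destroy_workqueue(reclaim_wq);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
--
Arming the keep-alive work on nvmet_wq (which is WQ_MEM_RECLAIM, per the
warning itself) keeps the flush dependency within reclaim-capable
workqueues, which is exactly what the one-liner above does.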

I'll send a patch. Thanks for reporting!


