nvmf/rdma host crash during heavy load and keep alive recovery
Steve Wise
swise at opengridcomputing.com
Mon Aug 15 07:39:26 PDT 2016
> Ah, I see the nvme_rdma worker thread running
> nvme_rdma_reconnect_ctrl_work() on the same nvme_rdma_queue that is
> handling the request and crashing:
>
> crash> bt 371
> PID: 371 TASK: ffff8803975a4300 CPU: 5 COMMAND: "kworker/5:2"
> [exception RIP: set_track+16]
> RIP: ffffffff81202070 RSP: ffff880397f2ba18 RFLAGS: 00000086
> RAX: 0000000000000001 RBX: ffff88039f407a00 RCX: ffffffffa0853234
> RDX: 0000000000000001 RSI: ffff8801d663e008 RDI: ffff88039f407a00
> RBP: ffff880397f2ba48 R8: ffff8801d663e158 R9: 000000000000005a
> R10: 00000000000000cc R11: 0000000000000000 R12: ffff8801d663e008
> R13: ffffea0007598f80 R14: 0000000000000001 R15: ffff8801d663e008
> CS: 0010 SS: 0018
> #0 [ffff880397f2ba50] free_debug_processing at ffffffff81204820
> #1 [ffff880397f2bad0] __slab_free at ffffffff81204bfb
> #2 [ffff880397f2bb90] kfree at ffffffff81204dcd
> #3 [ffff880397f2bc00] nvme_rdma_free_qe at ffffffffa0853234 [nvme_rdma]
> #4 [ffff880397f2bc20] nvme_rdma_destroy_queue_ib at ffffffffa0853dbf [nvme_rdma]
> #5 [ffff880397f2bc60] nvme_rdma_stop_and_free_queue at ffffffffa085402d [nvme_rdma]
> #6 [ffff880397f2bc80] nvme_rdma_reconnect_ctrl_work at ffffffffa0854957 [nvme_rdma]
> #7 [ffff880397f2bcb0] process_one_work at ffffffff810a1593
> #8 [ffff880397f2bd90] worker_thread at ffffffff810a222d
> #9 [ffff880397f2bec0] kthread at ffffffff810a6d6c
> #10 [ffff880397f2bf50] ret_from_fork at ffffffff816e2cbf
>
> So why is this request being processed during a reconnect?
Hey Sagi,
Do you have any ideas on this crash? I could really use some help. Is it
possible that recovery/reconnect/restart of a different controller is somehow
restarting the requests for a controller that is still in recovery? One issue,
perhaps: nvme_rdma_reconnect_ctrl_work() calls blk_mq_start_stopped_hw_queues()
before calling nvme_rdma_init_io_queues(). Is that a problem? I tried moving
blk_mq_start_stopped_hw_queues() to after the IO queues are set up, but that
causes a stall in nvme_rdma_reconnect_ctrl_work(). I think the blk-mq queues
need to be started for the admin queue connect to go through. Thoughts?
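
For reference, here is roughly the ordering in question, as a paraphrased
sketch of my reading of the reconnect path (tag set reinit and all error
handling omitted, so the details may differ from the actual source):

/* Paraphrased sketch of nvme_rdma_reconnect_ctrl_work() as I read it --
 * not the verbatim source; tag set reinit and error handling omitted.
 */
static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
{
	struct nvme_rdma_ctrl *ctrl = container_of(to_delayed_work(work),
			struct nvme_rdma_ctrl, reconnect_work);

	/* Tear down the old queues. */
	nvme_rdma_free_io_queues(ctrl);
	nvme_rdma_stop_and_free_queue(&ctrl->queues[0]);

	/* Rebuild and connect the admin queue.  The admin blk-mq hw
	 * queue has to be restarted here, or the fabrics Connect
	 * command can never be issued...
	 */
	nvme_rdma_init_queue(ctrl, 0, NVMF_AQ_DEPTH);
	blk_mq_start_stopped_hw_queues(ctrl->ctrl.admin_q, true);
	nvmf_connect_admin_queue(&ctrl->ctrl);

	/* ...but the IO queues' RDMA resources are only rebuilt after
	 * that, so a request restarted in this window could land on a
	 * queue whose qe's were just freed above -- which looks like
	 * what the backtrace shows.
	 */
	nvme_rdma_init_io_queues(ctrl);
	nvme_rdma_connect_io_queues(ctrl);
}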
Thanks,
Steve.