nvmf host shutdown hangs when nvmf controllers are in recovery/reconnect
Steve Wise
swise at opengridcomputing.com
Wed Aug 24 13:25:42 PDT 2016
> > Hey Steve,
> >
> > For some reason I can't reproduce this on my setup...
> >
> > So I'm wondering where the nvme_rdma_del_ctrl() thread is stuck.
> > Probably a dump of all the kworkers would be helpful here:
> >
> > $ pids=`ps -ef | grep kworker | grep -v grep | awk '{print $2}'`
> > $ for p in $pids; do echo "$p:"; cat /proc/$p/stack; done
> >
I can't do this because the system is crippled by the shutdown. My feeling,
though, is that the del_ctrl thread isn't getting scheduled. Note that the
difference between 'reboot' and 'reboot -f' is that without the -f, iw_cxgb4
isn't unloaded before we get stuck. So some part of 'reboot' must delete the
controllers for it to work. But I still don't know what is stalling the
reboot in the first place. Pending I/O, I guess?
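
(When the box is that far gone, one trick that sometimes still works is
dumping every task's stack to the console via magic sysrq:

$ echo t > /proc/sysrq-trigger

and then fishing the traces out of dmesg or the serial console. I'll try
that on the next run.)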
> > The fact that nvme1 keeps reconnecting forever means that
> > del_ctrl() never changes the controller state. Is there an
> > nvme0 on the system that is also being removed, where you
> > don't see the reconnect thread keep going?
> >
nvme0 is a local NVMe device in my setup.
> > My expectation would be that del_ctrl() would move the ctrl state
> > to DELETING, the reconnect thread would bail out, and then delete_work
> > would fire and delete the controller. Obviously something is not
> > happening as it should.
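
For reference, that expectation in sketch form (approximate names, not
the exact code):

    /* Sketch only: the delete path we'd expect to win the race */
    static void del_ctrl_expected(struct nvme_rdma_ctrl *ctrl)
    {
            /* mark DELETING; the reconnect worker should see this and bail */
            if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
                    return; /* someone else is already deleting it */

            /* delete_work then fires and tears the controller down */
            queue_work(nvme_rdma_wq, &ctrl->delete_work);
    }

If the reconnect worker never observes DELETING, it loops forever, which
matches what I see.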
>
> I think I know what is going on...
>
> When we get a surprise disconnect from the target, we queue
> a periodic reconnect (which is the sane thing to do...).
>
Or a keep-alive (KATO) timeout.
> We only move the queues out of CONNECTED when we retry the
> reconnect (after 10 seconds in the default case), but we stop
> the blk queues immediately so we are not bothered with traffic
> from then on. If delete() kicks in during this window, the queues
> are still in the CONNECTED state.
>
> Part of the delete sequence is issuing a ctrl shutdown if the
> admin queue is CONNECTED (which it is!). This request is issued but
> gets stuck in blk-mq waiting for the queues to start again. This might
> be what is preventing forward progress...
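
If I follow, the hang would be right about here (a sketch with
approximate names, not the verbatim code):

    /* Sketch: the delete path racing error recovery */
    static void shutdown_ctrl_sketch(struct nvme_rdma_ctrl *ctrl)
    {
            /*
             * The admin queue still has Q_CONNECTED set, because error
             * recovery only clears it when the reconnect attempt runs...
             */
            if (test_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[0].flags))
                    /*
                     * ...so we issue the shutdown command, but the blk-mq
                     * hw queues were already stopped, so the request never
                     * dispatches and we wait on it forever.
                     */
                    nvme_shutdown_ctrl(&ctrl->ctrl);
    }

That would explain the wedged reboot.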
>
> Steve, care to check if the below patch makes things better?
>
This doesn't help. I'm debugging to get more details. But can you answer this:
what code initiates the controller deletes for the active devices as part of a
'reboot'?
> The patch separates the queue flags into CONNECTED and
> DELETING. Now we move out of CONNECTED as soon as error recovery
> kicks in (before stopping the queues), and DELETING is set when
> we start the queue deletion.
>
> --
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 23297c5f85ed..75b49c29b890 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -86,6 +86,7 @@ struct nvme_rdma_request {
>  
>  enum nvme_rdma_queue_flags {
>          NVME_RDMA_Q_CONNECTED = (1 << 0),
> +        NVME_RDMA_Q_DELETING = (1 << 1),
>  };
>  
>  struct nvme_rdma_queue {
> @@ -612,7 +613,7 @@ static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
>  
>  static void nvme_rdma_stop_and_free_queue(struct nvme_rdma_queue *queue)
>  {
> -        if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags))
> +        if (test_and_set_bit(NVME_RDMA_Q_DELETING, &queue->flags))
>                  return;
>          nvme_rdma_stop_queue(queue);
>          nvme_rdma_free_queue(queue);
> @@ -764,8 +765,13 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>  {
>          struct nvme_rdma_ctrl *ctrl = container_of(work,
>                          struct nvme_rdma_ctrl, err_work);
> +        int i;
>  
>          nvme_stop_keep_alive(&ctrl->ctrl);
> +
> +        for (i = 0; i < ctrl->queue_count; i++)
> +                clear_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[i].flags);
> +
>          if (ctrl->queue_count > 1)
>                  nvme_stop_queues(&ctrl->ctrl);
>          blk_mq_stop_hw_queues(ctrl->ctrl.admin_q);
> @@ -1331,7 +1337,7 @@ static int nvme_rdma_device_unplug(struct nvme_rdma_queue *queue)
>          cancel_delayed_work_sync(&ctrl->reconnect_work);
>  
>          /* Disable the queue so ctrl delete won't free it */
> -        if (test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags)) {
> +        if (!test_and_set_bit(NVME_RDMA_Q_DELETING, &queue->flags)) {
>                  /* Free this queue ourselves */
>                  nvme_rdma_stop_queue(queue);
>                  nvme_rdma_destroy_queue_ib(queue);
> --
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme