[PATCH WIP/RFC 5/6] nvme-rdma: add DELETING queue flag
Steve Wise
swise at opengridcomputing.com
Fri Aug 26 07:14:50 PDT 2016
>
> From: Sagi Grimberg <sagi at grimberg.me>
>
> When we get a surprise disconnect from the target we queue a periodic
> reconnect (which is the sane thing to do...).
>
> We only move the queues out of CONNECTED when we retry to reconnect (after
> 10 seconds in the default case) but we stop the blk queues immediately
> so we are not bothered with traffic from now on. If delete() is kicking
> off in this period the queues are still in CONNECTED state.
>
> Part of the delete sequence is trying to issue ctrl shutdown if the
> admin queue is CONNECTED (which it is!). This request is issued but
> stuck in blk-mq waiting for the queues to start again. This might be
> the one preventing us from forward progress...
>
> The patch separates the queue flags to CONNECTED and DELETING. Now we
> will move out of CONNECTED as soon as error recovery kicks in (before
> stopping the queues) and DELETING is on when we start the queue deletion.
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Sagi,
This patch is missing the change to nvme_rdma_device_unplug(). That is my
mistake. Since patch 6 removes that part of the unplug logic, the omission is
benign for the series as a whole. It should still be fixed so that this patch
addresses the problem on its own, whether or not patch 6 is applied. I can
fix this up if we decide patch 6 is the correct approach...
Steve.
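
For readers following the thread, here is a minimal userspace sketch of the
flag split described in the quoted commit message. It is a toy model, not the
driver code: the names (model_queue, Q_CONNECTED, Q_DELETING, model_*) are
hypothetical, and a plain unsigned int stands in for whatever bit operations
the real patch uses. It only illustrates the ordering: error recovery clears
CONNECTED before stopping the block queues, so a later delete no longer tries
to issue a controller shutdown on a stopped queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two queue flags; not the driver's definitions. */
enum model_queue_flag {
	Q_CONNECTED = 1u << 0,
	Q_DELETING  = 1u << 1,
};

struct model_queue {
	unsigned int flags;
	bool blk_stopped;	/* stands in for the stopped blk-mq queues */
};

/* A successful (re)connect marks the queue CONNECTED and restarts it. */
static void model_connect(struct model_queue *q)
{
	assert(q != NULL);
	q->flags = Q_CONNECTED;
	q->blk_stopped = false;
}

/* Error recovery: leave CONNECTED first, then stop the block queues. */
static void model_error_recovery(struct model_queue *q)
{
	assert(q != NULL);
	q->flags &= ~Q_CONNECTED;
	q->blk_stopped = true;
}

/*
 * Delete: set DELETING, then report whether a controller shutdown
 * would be issued. It is issued only if the admin queue is still
 * CONNECTED, so it is skipped once error recovery has run.
 */
static bool model_delete(struct model_queue *admin)
{
	assert(admin != NULL);
	admin->flags |= Q_DELETING;
	return (admin->flags & Q_CONNECTED) != 0;
}
```

In this model, calling model_delete() on a freshly connected queue reports
that a shutdown would be issued, while calling it after
model_error_recovery() does not, which is the behaviour the commit message
is aiming for.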