[PATCH v2] nvme-multipath: fix possible hang in live ns resize with ANA access
Chao Leng
lengchao at huawei.com
Wed Sep 28 18:32:31 PDT 2022
On 2022/9/29 4:10, Sagi Grimberg wrote:
> When we revalidate paths as part of an ns size change (as of commit
> e7d65803e2bb), it is possible that during the path revalidation, the
> only paths that are IO capable (i.e. optimized/non-optimized) are the
> ones that have not yet reported the ns resize to the host, which will
> cause inflight requests to be requeued (as we have available paths but
> none are IO capable). These requests on the requeue list are waiting
> for someone to resubmit them at some point.
>
> The IO capable paths will eventually notify the host of the ns resize,
> but there is nothing that will kick the requeue list to resubmit the
> queued requests.
>
> Fix this by always kicking the requeue list, and if no IO capable path
> exists, these requests will just end up being queued again.
>
> A typical log that indicates that IOs are requeued:
> --
> nvme nvme1: creating 4 I/O queues.
> nvme nvme1: new ctrl: "testnqn1"
> nvme nvme2: creating 4 I/O queues.
> nvme nvme2: mapped 4/0/0 default/read/poll queues.
> nvme nvme2: new ctrl: NQN "testnqn1", addr 127.0.0.1:8009
> nvme nvme1: rescanning namespaces.
> nvme1n1: detected capacity change from 2097152 to 4194304
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> block nvme1n1: no usable path - requeuing I/O
> nvme nvme2: rescanning namespaces.
> --
>
> Reported-by: Yogev Cohen <yogev at lightbitslabs.com>
> Fixes: e7d65803e2bb ("nvme-multipath: revalidate paths during rescan")
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
> Changes from v1:
> - fix commit msg body format
> - follow reverse-xmas declaration pattern
>
> drivers/nvme/host/multipath.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 6ef497c75a16..1113139c9736 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -173,15 +173,17 @@ void nvme_mpath_revalidate_paths(struct nvme_ns *ns)
> {
>  	struct nvme_ns_head *head = ns->head;
>  	sector_t capacity = get_capacity(head->disk);
> +	struct nvme_ns *n;
>  	int node;
>
> -	list_for_each_entry_rcu(ns, &head->list, siblings) {
> -		if (capacity != get_capacity(ns->disk))
> -			clear_bit(NVME_NS_READY, &ns->flags);
> +	list_for_each_entry_rcu(n, &head->list, siblings) {
> +		if (capacity != get_capacity(n->disk))
> +			clear_bit(NVME_NS_READY, &n->flags);
>  	}
>
>  	for_each_node(node)
>  		rcu_assign_pointer(head->current_path[node], NULL);
> +	nvme_kick_requeue_lists(ns->ctrl);
We only need to schedule the requeue_work of this head, not of all heads.
We can do it like this:
	kblockd_schedule_work(&head->requeue_work);
> }
>
> static bool nvme_path_is_disabled(struct nvme_ns *ns)
>
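To make the hang easier to see, here is a toy userspace model of the
sequence described in the commit message. It is only a sketch, not kernel
code: every toy_* name is made up for illustration, the "requeue list" is
just a counter, and toy_kick_requeue() stands in for running the
requeue_work of a single head. It assumes two paths, one that is not IO
capable but learns the new size first, and one that is IO capable but
learns it later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PATHS 2

/* Hypothetical stand-in for a per-path namespace (not struct nvme_ns). */
struct toy_path {
	unsigned long capacity;	/* size this path currently reports */
	bool ready;		/* models NVME_NS_READY */
	bool io_capable;	/* models an optimized/non-optimized ANA state */
};

/* Hypothetical stand-in for the multipath head (not struct nvme_ns_head). */
struct toy_head {
	unsigned long capacity;
	struct toy_path paths[TOY_MAX_PATHS];
	unsigned int requeued;	/* I/Os parked on the requeue list */
	unsigned int completed;	/* I/Os that found a usable path */
};

static bool toy_has_usable_path(const struct toy_head *h)
{
	for (size_t i = 0; i < TOY_MAX_PATHS; i++)
		if (h->paths[i].ready && h->paths[i].io_capable)
			return true;
	return false;
}

/* Submit one I/O: complete it if a path is usable, otherwise park it. */
static void toy_submit(struct toy_head *h)
{
	if (toy_has_usable_path(h))
		h->completed++;
	else
		h->requeued++;
}

/* Models the requeue work of this one head: resubmit everything parked. */
static void toy_kick_requeue(struct toy_head *h)
{
	unsigned int pending = h->requeued;

	h->requeued = 0;
	for (unsigned int i = 0; i < pending; i++)
		toy_submit(h);
}

/* Path idx learns the new size; siblings with a stale size lose "ready". */
static void toy_revalidate(struct toy_head *h, size_t idx,
			   unsigned long new_cap, bool kick)
{
	assert(idx < TOY_MAX_PATHS);
	h->capacity = new_cap;
	h->paths[idx].capacity = new_cap;
	h->paths[idx].ready = true;
	for (size_t i = 0; i < TOY_MAX_PATHS; i++)
		if (h->paths[i].capacity != h->capacity)
			h->paths[i].ready = false;
	if (kick)
		toy_kick_requeue(h);
}

/* Returns how many I/Os are still parked after both paths saw the resize. */
unsigned int toy_scenario(bool kick)
{
	struct toy_head h = {
		.capacity = 100,
		.paths = {
			{ .capacity = 100, .ready = true, .io_capable = false },
			{ .capacity = 100, .ready = true, .io_capable = true },
		},
	};

	toy_revalidate(&h, 0, 200, kick);	/* non IO capable path resizes first */
	for (int i = 0; i < 3; i++)
		toy_submit(&h);			/* all three get requeued */
	toy_revalidate(&h, 1, 200, kick);	/* IO capable path catches up */
	return h.requeued;
}
```

In this model toy_scenario(false) leaves 3 I/Os parked forever, while
toy_scenario(true) drains them, and kicking only the one head is enough.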