[PATCH v2] nvme: core: shorten duration of multipath namespace rescan

Sagi Grimberg sagi at grimberg.me
Mon Aug 26 07:30:03 PDT 2024




On 26/08/2024 17:22, Martin Wilck wrote:
> For multipath devices, nvme_update_ns_info() needs to freeze both
> the queue of the path and the queue of the multipath device. For
> both operations, it waits for one RCU grace period to pass, ~25ms
> on my test system. By calling blk_freeze_queue_start() for the
> multipath queue early, we avoid waiting twice; tests using ftrace
> have shown that the second blk_mq_freeze_queue_wait() call finishes
> in just a few microseconds. The path queue is unfrozen before
> calling blk_mq_freeze_queue_wait() on the multipath queue, so that
> possibly outstanding IO in the multipath queue can be flushed.
>
> I tested this using the "controller rescan under I/O load" test
> I submitted recently [1].
>
> [1] https://lore.kernel.org/linux-nvme/20240822193814.106111-3-mwilck@suse.com/T/#u
>
> Signed-off-by: Martin Wilck <mwilck at suse.com>
> ---
> v2: (all changes suggested by Sagi Grimberg)
>   - patch subject changed from "nvme: core: freeze multipath queue early in
>     nvme_update_ns_info()" to "nvme: core: shorten duration of multipath
>     namespace rescan"
>   - inserted comment explaining why blk_freeze_queue_start() is called early
>   - wait for queue to be frozen even if ret != 0
>   - make code structure more obvious vs. freeze_start / freeze_wait / unfreeze

This looks a lot better, thanks.

>
> Hannes and Daniel had already added Reviewed-by: tags to the v1 patch, but
> I didn't add them above, because the patch looks quite different now.
>
> ---
>   drivers/nvme/host/core.c | 21 ++++++++++++++++-----
>   1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0dc8bcc664f2..1ea052450846 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2217,6 +2217,16 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>   	bool unsupported = false;
>   	int ret;
>   
> +	/*
> +	 * The controller queue is going to be frozen in
> +	 * nvme_update_ns_info_{generic,block}(). Every freeze implies waiting
> +	 * for an RCU grace period to pass. For multipath devices, we
> +	 * need to freeze the multipath queue, too. Start freezing the
> +	 * multipath queue now, lest we need to wait for two grace periods.
> +	 */
> +	if (nvme_ns_head_multipath(ns->head))
> +		blk_freeze_queue_start(ns->head->disk->queue);
> +
>   	switch (info->ids.csi) {
>   	case NVME_CSI_ZNS:
>   		if (!IS_ENABLED(CONFIG_BLK_DEV_ZONED)) {
> @@ -2250,11 +2260,13 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>   		ret = 0;
>   	}
>   
> -	if (!ret && nvme_ns_head_multipath(ns->head)) {
> +	if (!nvme_ns_head_multipath(ns->head))
> +		return ret;
> +
> +	blk_mq_freeze_queue_wait(ns->head->disk->queue);
> +	if (!ret) {
>   		struct queue_limits *ns_lim = &ns->disk->queue->limits;
>   		struct queue_limits lim;
> -
> -		blk_mq_freeze_queue(ns->head->disk->queue);
>   		/*
>   		 * queue_limits mixes values that are the hardware limitations
>   		 * for bio splitting with what is the device configuration.
> @@ -2286,9 +2298,8 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>   		set_capacity_and_notify(ns->head->disk, get_capacity(ns->disk));
>   		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
>   		nvme_mpath_revalidate_paths(ns);
> -
> -		blk_mq_unfreeze_queue(ns->head->disk->queue);
>   	}
> +	blk_mq_unfreeze_queue(ns->head->disk->queue);

I'd make it even nicer by adding an out label and reversing the polarity
of the ret check:
--
	blk_mq_freeze_queue_wait(ns->head->disk->queue);
	if (ret)
		goto out;
	...
out:
	blk_mq_unfreeze_queue(ns->head->disk->queue);
	return ret;
--


