[PATCH 2/3] nvme-multipath: Implement new iopolicy "queue-depth"

Hannes Reinecke hare at suse.de
Wed Sep 27 00:38:45 PDT 2023


On 9/25/23 18:31, Ewan D. Milne wrote:
> The existing iopolicies are inefficient in some cases, such as
> the presence of a path with high latency. The round-robin
> policy would use that path equally with faster paths, which
> results in sub-optimal performance.
> 
> The queue-depth policy instead sends I/O requests down the path
> with the least amount of requests in its request queue. Paths
> with lower latency will clear requests more quickly and have fewer
> requests in their queues compared to "bad" paths. The aim is to
> use those paths the most to bring down overall latency.
> 
> [edm: patch developed by Thomas Song @ Pure Storage, fixed whitespace
>        and compilation warnings, updated MODULE_PARM description
>        and changed to use new blk_mq_queue_nr_active() instead of adding
>        additional atomic counters.]
> 
> Co-developed-by: Thomas Song <tsong at purestorage.com>
> Signed-off-by: Ewan D. Milne <emilne at redhat.com>
> ---
>   drivers/nvme/host/multipath.c | 48 +++++++++++++++++++++++++++++++++--
>   drivers/nvme/host/nvme.h      |  1 +
>   2 files changed, 47 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 0a88d7bdc5e3..deea7fd4aa95 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -17,6 +17,7 @@ MODULE_PARM_DESC(multipath,
>   static const char *nvme_iopolicy_names[] = {
>   	[NVME_IOPOLICY_NUMA]	= "numa",
>   	[NVME_IOPOLICY_RR]	= "round-robin",
> +	[NVME_IOPOLICY_QD]      = "queue-depth",
>   };
>   
>   static int iopolicy = NVME_IOPOLICY_NUMA;
> @@ -29,6 +30,8 @@ static int nvme_set_iopolicy(const char *val, const struct kernel_param *kp)
>   		iopolicy = NVME_IOPOLICY_NUMA;
>   	else if (!strncmp(val, "round-robin", 11))
>   		iopolicy = NVME_IOPOLICY_RR;
> +	else if (!strncmp(val, "queue-depth", 11))
> +		iopolicy = NVME_IOPOLICY_QD;
>   	else
>   		return -EINVAL;
>   
> @@ -43,7 +46,7 @@ static int nvme_get_iopolicy(char *buf, const struct kernel_param *kp)
>   module_param_call(iopolicy, nvme_set_iopolicy, nvme_get_iopolicy,
>   	&iopolicy, 0644);
>   MODULE_PARM_DESC(iopolicy,
> -	"Default multipath I/O policy; 'numa' (default) or 'round-robin'");
> +	"Default multipath I/O policy; 'numa' (default) , 'round-robin' or 'queue-depth'");
>   
>   void nvme_mpath_default_iopolicy(struct nvme_subsystem *subsys)
>   {
> @@ -329,6 +332,41 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
>   	return found;
>   }
>   
> +static struct nvme_ns *nvme_queue_depth_path(struct nvme_ns_head *head,
> +		int node)
> +{
> +	struct nvme_ns *best_opt = NULL, *best_nonopt = NULL, *ns;
> +	unsigned int min_depth_opt = UINT_MAX, min_depth_nonopt = UINT_MAX;
> +	unsigned int depth;
> +
> +	list_for_each_entry_rcu(ns, &head->list, siblings) {
> +		if (nvme_path_is_disabled(ns))
> +			continue;
> +
> +		depth = blk_mq_queue_nr_active(ns->queue);

That would require an atomic_read() on each path for each I/O,
which absolutely sucks for performance.

Considering that we really are only interested in a rough idea of
path usage, wouldn't it be better to use READ_ONCE()/WRITE_ONCE()
for accumulating the number of requests per queue?
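
Something like the below is what I have in mind. Completely untested
sketch; the nr_active member and the two helpers are made up here
purely for illustration, they are not part of the patch:

/* hypothetical extra member in struct nvme_ns: */
	unsigned int		nr_active;

/* bump/drop when a request is started/completed on this path: */
static inline void nvme_path_inc_active(struct nvme_ns *ns)
{
	WRITE_ONCE(ns->nr_active, READ_ONCE(ns->nr_active) + 1);
}

static inline void nvme_path_dec_active(struct nvme_ns *ns)
{
	WRITE_ONCE(ns->nr_active, READ_ONCE(ns->nr_active) - 1);
}

/* and nvme_queue_depth_path() would then simply do: */
	depth = READ_ONCE(ns->nr_active);

Concurrent updates can race, so the counter is only ever approximate,
but for steering I/O away from a congested path that should be good
enough, and it keeps the atomic ops out of the hot path.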

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare at suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



