[PATCH 2/3] nvme-multipath: Implement new iopolicy "queue-depth"
Hannes Reinecke
hare at suse.de
Wed Sep 27 00:38:45 PDT 2023
On 9/25/23 18:31, Ewan D. Milne wrote:
> The existing iopolicies are inefficient in some cases, such as
> the presence of a path with high latency. The round-robin
> policy would use that path equally with faster paths, which
> results in sub-optimal performance.
>
> The queue-depth policy instead sends I/O requests down the path
> with the fewest requests in its request queue. Paths with lower
> latency will clear requests more quickly and have fewer requests
> in their queues compared to "bad" paths. The aim is to use those
> paths the most to bring down overall latency.
>
> [edm: patch developed by Thomas Song @ Pure Storage, fixed whitespace
> and compilation warnings, updated MODULE_PARM description
> and changed to use new blk_mq_queue_nr_active() instead of adding
> additional atomic counters.]
>
> Co-developed-by: Thomas Song <tsong at purestorage.com>
> Signed-off-by: Ewan D. Milne <emilne at redhat.com>
> ---
> drivers/nvme/host/multipath.c | 48 +++++++++++++++++++++++++++++++++--
> drivers/nvme/host/nvme.h | 1 +
> 2 files changed, 47 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 0a88d7bdc5e3..deea7fd4aa95 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -17,6 +17,7 @@ MODULE_PARM_DESC(multipath,
> static const char *nvme_iopolicy_names[] = {
> [NVME_IOPOLICY_NUMA] = "numa",
> [NVME_IOPOLICY_RR] = "round-robin",
> + [NVME_IOPOLICY_QD] = "queue-depth",
> };
>
> static int iopolicy = NVME_IOPOLICY_NUMA;
> @@ -29,6 +30,8 @@ static int nvme_set_iopolicy(const char *val, const struct kernel_param *kp)
> iopolicy = NVME_IOPOLICY_NUMA;
> else if (!strncmp(val, "round-robin", 11))
> iopolicy = NVME_IOPOLICY_RR;
> + else if (!strncmp(val, "queue-depth", 11))
> + iopolicy = NVME_IOPOLICY_QD;
> else
> return -EINVAL;
>
> @@ -43,7 +46,7 @@ static int nvme_get_iopolicy(char *buf, const struct kernel_param *kp)
> module_param_call(iopolicy, nvme_set_iopolicy, nvme_get_iopolicy,
> &iopolicy, 0644);
> MODULE_PARM_DESC(iopolicy,
> - "Default multipath I/O policy; 'numa' (default) or 'round-robin'");
> + "Default multipath I/O policy; 'numa' (default) , 'round-robin' or 'queue-depth'");
>
> void nvme_mpath_default_iopolicy(struct nvme_subsystem *subsys)
> {
> @@ -329,6 +332,41 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
> return found;
> }
>
> +static struct nvme_ns *nvme_queue_depth_path(struct nvme_ns_head *head,
> + int node)
> +{
> + struct nvme_ns *best_opt = NULL, *best_nonopt = NULL, *ns;
> + unsigned int min_depth_opt = UINT_MAX, min_depth_nonopt = UINT_MAX;
> + unsigned int depth;
> +
> + list_for_each_entry_rcu(ns, &head->list, siblings) {
> + if (nvme_path_is_disabled(ns))
> + continue;
> +
> + depth = blk_mq_queue_nr_active(ns->queue);
That would require an 'atomic_read' on each path for each I/O,
which absolutely sucks for performance.
Considering that we really are only interested in a rough idea of
path usage, wouldn't it be better to use READ_ONCE()/WRITE_ONCE()
for accumulating the number of requests per queue?
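Something along these lines would do (completely untested sketch, just
to illustrate the idea; the helper names and the plain 'nr_active'
counter are made up here and would have to be added to struct nvme_ns):

/*
 * Approximate per-path queue depth without any atomic RMW.
 * Updates are racy by design; the occasional lost increment or
 * decrement is acceptable as we only want a rough estimate.
 */
static inline void nvme_path_inc_active(struct nvme_ns *ns)
{
	WRITE_ONCE(ns->nr_active, READ_ONCE(ns->nr_active) + 1);
}

static inline void nvme_path_dec_active(struct nvme_ns *ns)
{
	WRITE_ONCE(ns->nr_active, READ_ONCE(ns->nr_active) - 1);
}

The selector in nvme_queue_depth_path() would then do a single plain
load instead of calling blk_mq_queue_nr_active():

	depth = READ_ONCE(ns->nr_active);

with the inc/dec hooked into I/O submission and completion. That keeps
the hot path to plain loads and stores, trading a bit of accuracy,
which should be fine given we only need a rough idea of path usage.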
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman