[PATCH] nvme: find numa distance only if controller has valid numa id

Sagi Grimberg sagi at grimberg.me
Sun Apr 14 01:30:35 PDT 2024



On 13/04/2024 12:04, Nilay Shroff wrote:
> On a NUMA-aware system where native NVMe multipath is configured and
> the iopolicy is set to numa, the NVMe controller's NUMA node id may be
> undefined, i.e. -1 (NUMA_NO_NODE). In that case, avoid calculating the
> node distance when finding the optimal I/O path: doing so may index the
> NUMA distance table with an invalid value and potentially refer to
> incorrect memory. This patch ensures that if the NVMe controller's NUMA
> node id is -1, then instead of calculating the node distance we set the
> NUMA node distance of that controller to the default of 10
> (LOCAL_DISTANCE).

Patch looks ok to me, but it is not clear whether this fixes a real
issue or not.
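
For illustration, here is a minimal userspace sketch of the guard the
patch adds. It is not kernel code: the sketch_* names and the two-node
distance table are hypothetical, and only the values -1 (NUMA_NO_NODE)
and 10 (LOCAL_DISTANCE) come from the commit message. It shows the
logic of the check only; it says nothing about whether the kernel's
node_distance() actually misbehaves when given NUMA_NO_NODE.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel definitions; the values follow
 * the commit message: NUMA_NO_NODE is -1 and LOCAL_DISTANCE is 10. */
#define SKETCH_NUMA_NO_NODE   (-1)
#define SKETCH_LOCAL_DISTANCE 10
#define SKETCH_NR_NODES       2

/* Hypothetical two-node distance table (not taken from a real machine). */
static const int sketch_distance_table[SKETCH_NR_NODES][SKETCH_NR_NODES] = {
	{ 10, 20 },
	{ 20, 10 },
};

/* Raw table lookup: only valid for node ids in [0, SKETCH_NR_NODES). */
static int sketch_node_distance(int from, int to)
{
	assert(from >= 0 && from < SKETCH_NR_NODES);
	assert(to >= 0 && to < SKETCH_NR_NODES);
	return sketch_distance_table[from][to];
}

/*
 * Guarded lookup mirroring the patched loop body: start from the local
 * distance and consult the table only when the NUMA policy is active
 * and the controller has a valid node id.
 */
static int sketch_path_distance(int numa_policy, int node, int ctrl_node)
{
	int distance = SKETCH_LOCAL_DISTANCE;

	if (numa_policy && ctrl_node != SKETCH_NUMA_NO_NODE)
		distance = sketch_node_distance(node, ctrl_node);

	return distance;
}
```

With this table, a controller on node 1 seen from node 0 gets the table
value, while a controller with no node id falls back to the local
distance without touching the table.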

>
> Signed-off-by: Nilay Shroff <nilay at linux.ibm.com>
> ---
>   drivers/nvme/host/multipath.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 5397fb428b24..4c73a8038978 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -240,17 +240,19 @@ static bool nvme_path_is_disabled(struct nvme_ns *ns)
>   
>   static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head, int node)
>   {
> -	int found_distance = INT_MAX, fallback_distance = INT_MAX, distance;
> +	int found_distance = INT_MAX, fallback_distance = INT_MAX;
>   	struct nvme_ns *found = NULL, *fallback = NULL, *ns;
>   
>   	list_for_each_entry_rcu(ns, &head->list, siblings) {
> +		int distance = LOCAL_DISTANCE;
> +
>   		if (nvme_path_is_disabled(ns))
>   			continue;
>   
> -		if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_NUMA)
> -			distance = node_distance(node, ns->ctrl->numa_node);
> -		else
> -			distance = LOCAL_DISTANCE;
> +		if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_NUMA) {
> +			if (ns->ctrl->numa_node != NUMA_NO_NODE)
> +				distance = node_distance(node, ns->ctrl->numa_node);
> +		}
>   
>   		switch (ns->ana_state) {
>   		case NVME_ANA_OPTIMIZED:



