[PATCH 03/13] libmultipath: Add path selection support

John Garry john.g.garry at oracle.com
Mon Mar 2 07:11:59 PST 2026


On 02/03/2026 12:36, Nilay Shroff wrote:
> On 2/25/26 9:02 PM, John Garry wrote:
>> Add code for path selection.
>>
>> NVMe ANA is abstracted into enum mpath_access_state. The motivation here is
>> so that SCSI ALUA can be used. Callbacks .is_disabled, .is_optimized,
>> .get_access_state are added to get the path access state.
>>
>> Path selection modes round-robin, NUMA, and queue-depth are added, same
>> as NVMe supports.
>>
>> NVMe has almost like-for-like equivalents here:
>> - __mpath_find_path() -> __nvme_find_path()
>> - mpath_find_path() -> nvme_find_path()
>>
>> and similar for all introduced callee functions.
>>
>> Functions mpath_set_iopolicy() and mpath_get_iopolicy() are added for
>> setting default iopolicy.
>>
>> A separate mpath_iopolicy structure is introduced. There is no iopolicy
>> member included in the mpath_head structure as it may not suit NVMe, where
>> iopolicy is per-subsystem and not per namespace.
>>
>> Signed-off-by: John Garry <john.g.garry at oracle.com>
>> ---
>>   include/linux/multipath.h |  36 ++++++
>>   lib/multipath.c           | 251 ++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 287 insertions(+)
>>
>> diff --git a/include/linux/multipath.h b/include/linux/multipath.h
>> index be9dd9fb83345..c964a1aba9c42 100644
>> --- a/include/linux/multipath.h
>> +++ b/include/linux/multipath.h
>> @@ -7,6 +7,22 @@
>>   extern const struct block_device_operations mpath_ops;
>> +enum mpath_iopolicy_e {
>> +    MPATH_IOPOLICY_NUMA,
>> +    MPATH_IOPOLICY_RR,
>> +    MPATH_IOPOLICY_QD,
>> +};
>> +
>> +struct mpath_iopolicy {
>> +    enum mpath_iopolicy_e    iopolicy;
>> +};
>> +
>> +enum mpath_access_state {
>> +    MPATH_STATE_OPTIMIZED,
>> +    MPATH_STATE_ACTIVE,
>> +    MPATH_STATE_INVALID    = 0xFF
>> +};
> Hmm so here we don't have MPATH_STATE_NONOPTIMIZED.
> We are morphing NVME_ANA_NONOPTIMIZED as MPATH_STATE_ACTIVE.

Yes. For path selection, MPATH_STATE_ACTIVE is treated the same as 
NVME_ANA_NONOPTIMIZED.
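
To illustrate the mapping, here is a rough userspace sketch (not code from 
this series). The demo_ana_state enum is a stand-in for the NVMe ANA states, 
demo_ana_to_mpath_state() is a hypothetical helper, and folding every other 
ANA state into MPATH_STATE_INVALID is an assumption made for the example:

```c
/* Stand-in for the NVMe ANA states (names are illustrative only). */
enum demo_ana_state {
	DEMO_ANA_OPTIMIZED       = 0x01,
	DEMO_ANA_NONOPTIMIZED    = 0x02,
	DEMO_ANA_INACCESSIBLE    = 0x03,
	DEMO_ANA_PERSISTENT_LOSS = 0x04,
	DEMO_ANA_CHANGE          = 0x0f,
};

/* Copied from the quoted patch. */
enum mpath_access_state {
	MPATH_STATE_OPTIMIZED,
	MPATH_STATE_ACTIVE,
	MPATH_STATE_INVALID	= 0xFF
};

/*
 * Hypothetical helper: non-optimized maps to MPATH_STATE_ACTIVE.
 * Assumption: all remaining ANA states are reported as invalid.
 */
static enum mpath_access_state
demo_ana_to_mpath_state(enum demo_ana_state ana)
{
	switch (ana) {
	case DEMO_ANA_OPTIMIZED:
		return MPATH_STATE_OPTIMIZED;
	case DEMO_ANA_NONOPTIMIZED:
		return MPATH_STATE_ACTIVE;
	default:
		return MPATH_STATE_INVALID;
	}
}
```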

> Is it because SCSI doesn't have (NONOPTIMIZED) state?

It does have an active (and optimal) state, but I think that keeping 
NVMe terminology may be better for now.

> 
>> +
>>   struct mpath_disk {
>>       struct gendisk        *disk;
>>       struct kref        ref;
>> @@ -18,10 +34,16 @@ struct mpath_disk {
>>   struct mpath_device {
>>       struct list_head    siblings;
>> +    atomic_t        nr_active;
>>       struct gendisk        *disk;
>> +    int            numa_node;
>>   };
> I haven't seen any API which help set nr_active or numa_node.

I missed setting numa_node for NVMe. As for nr_active, it is set and read 
by the NVMe code, e.g. in nvme_mpath_start_request(). I did try to abstract 
that function into a common helper, but it just became a mess.
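
For reference, the idea behind driver-side nr_active accounting can be 
sketched in userspace as below. This is only an illustration under stated 
assumptions: struct demo_path and the demo_*() functions are hypothetical 
names, C11 atomics stand in for the kernel's atomic_t, and the selector simply 
picks the path with the fewest in-flight requests (first one wins on a tie):

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-path fields added to mpath_device. */
struct demo_path {
	atomic_int	nr_active;	/* in-flight requests on this path */
	int		numa_node;	/* set by the driver, unused here */
};

/* Driver bumps the counter when it starts a request on a path... */
static void demo_start_request(struct demo_path *p)
{
	atomic_fetch_add(&p->nr_active, 1);
}

/* ...and drops it again when the request completes. */
static void demo_end_request(struct demo_path *p)
{
	atomic_fetch_sub(&p->nr_active, 1);
}

/* Queue-depth style selection: return the path with the lowest depth. */
static struct demo_path *demo_pick_min_depth(struct demo_path *paths, size_t n)
{
	struct demo_path *best = NULL;
	int best_depth = 0;

	for (size_t i = 0; i < n; i++) {
		int depth = atomic_load(&paths[i].nr_active);

		if (!best || depth < best_depth) {
			best = &paths[i];
			best_depth = depth;
		}
	}
	return best;	/* NULL when there are no paths */
}
```

The selector only reads the counters, so keeping the increment and decrement 
in the driver's request start/end hooks is enough for it to work.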

> Do we need to have those under struct mpath_head_template ?

I think that the drivers can handle these directly.

Thanks


