[PATCH 03/13] libmultipath: Add path selection support
John Garry
john.g.garry at oracle.com
Mon Mar 2 07:11:59 PST 2026
On 02/03/2026 12:36, Nilay Shroff wrote:
> On 2/25/26 9:02 PM, John Garry wrote:
>> Add code for path selection.
>>
>> NVMe ANA is abstracted into enum mpath_access_state. The motivation here is
>> so that SCSI ALUA can be used. Callbacks .is_disabled, .is_optimized,
>> .get_access_state are added to get the path access state.
>>
>> Path selection modes round-robin, NUMA, and queue-depth are added, same
>> as NVMe supports.
>>
>> NVMe has almost like-for-like equivalents here:
>> - __mpath_find_path() -> __nvme_find_path()
>> - mpath_find_path() -> nvme_find_path()
>>
>> and similar for all introduced callee functions.
>>
>> Functions mpath_set_iopolicy() and mpath_get_iopolicy() are added for
>> setting default iopolicy.
>>
>> A separate mpath_iopolicy structure is introduced. There is no iopolicy
>> member included in the mpath_head structure as it may not suit NVMe, where
>> iopolicy is per-subsystem and not per namespace.
>>
>> Signed-off-by: John Garry <john.g.garry at oracle.com>
>> ---
>> include/linux/multipath.h | 36 ++++++
>> lib/multipath.c | 251 ++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 287 insertions(+)
>>
>> diff --git a/include/linux/multipath.h b/include/linux/multipath.h
>> index be9dd9fb83345..c964a1aba9c42 100644
>> --- a/include/linux/multipath.h
>> +++ b/include/linux/multipath.h
>> @@ -7,6 +7,22 @@
>> extern const struct block_device_operations mpath_ops;
>> +enum mpath_iopolicy_e {
>> + MPATH_IOPOLICY_NUMA,
>> + MPATH_IOPOLICY_RR,
>> + MPATH_IOPOLICY_QD,
>> +};
>> +
>> +struct mpath_iopolicy {
>> + enum mpath_iopolicy_e iopolicy;
>> +};
>> +
>> +enum mpath_access_state {
>> + MPATH_STATE_OPTIMIZED,
>> + MPATH_STATE_ACTIVE,
>> + MPATH_STATE_INVALID = 0xFF
>> +};
> Hmm so here we don't have MPATH_STATE_NONOPTIMIZED.
> We are morphing NVME_ANA_NONOPTIMIZED as MPATH_STATE_ACTIVE.
Yes, MPATH_STATE_ACTIVE is treated the same as NVME_ANA_NONOPTIMIZED for
path selection.
> Is it because SCSI doesn't have (NONOPTIMIZED) state?
It does have an active (and optimal) state, but I think that keeping
NVMe terminology may be better for now.
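
To illustrate the mapping being discussed, here is a rough userspace
sketch. The helper name nvme_ana_to_mpath_state() is made up for this
example and is not in the patch, the NVMe enum is a local stand-in for the
one in include/linux/nvme.h, and folding every other ANA state into
MPATH_STATE_INVALID is only an assumption:

```c
#include <assert.h>

/* Local stand-in for the NVMe ANA states from include/linux/nvme.h. */
enum nvme_ana_state {
	NVME_ANA_OPTIMIZED		= 0x01,
	NVME_ANA_NONOPTIMIZED		= 0x02,
	NVME_ANA_INACCESSIBLE		= 0x03,
	NVME_ANA_PERSISTENT_LOSS	= 0x04,
	NVME_ANA_CHANGE			= 0x0f,
};

/* Copied from the patch. */
enum mpath_access_state {
	MPATH_STATE_OPTIMIZED,
	MPATH_STATE_ACTIVE,
	MPATH_STATE_INVALID = 0xFF
};

/* The patch fixes this value explicitly, so check it at compile time. */
static_assert(MPATH_STATE_INVALID == 0xFF, "unexpected INVALID value");

/*
 * Hypothetical helper (not in the patch): translate an ANA state into the
 * generic access state used for path selection.
 */
static enum mpath_access_state nvme_ana_to_mpath_state(enum nvme_ana_state ana)
{
	switch (ana) {
	case NVME_ANA_OPTIMIZED:
		return MPATH_STATE_OPTIMIZED;
	case NVME_ANA_NONOPTIMIZED:
		/* Non-optimized is morphed into ACTIVE, as noted above. */
		return MPATH_STATE_ACTIVE;
	default:
		/* Assumption: anything else is not usable for I/O. */
		return MPATH_STATE_INVALID;
	}
}
```

A SCSI ALUA driver could provide an equivalent translation of its own
states behind the same callbacks.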
>
>> +
>> struct mpath_disk {
>> struct gendisk *disk;
>> struct kref ref;
>> @@ -18,10 +34,16 @@ struct mpath_disk {
>> struct mpath_device {
>> struct list_head siblings;
>> + atomic_t nr_active;
>> struct gendisk *disk;
>> + int numa_node;
>> };
> I haven't seen any API which help set nr_active or numa_node.
I missed setting numa_node for NVMe. As for nr_active, that is set and
read by the NVMe code, e.g. in nvme_mpath_start_request(). I did try to
abstract that function into a common helper, but it just becomes a mess.
> Do we need to have those under struct mpath_head_template ?
I think that the drivers can handle these directly.
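
As a rough illustration of what handling these directly could look like,
here is a userspace sketch using C11 atomics instead of the kernel's
atomic_t. The drv_start_request(), drv_end_request() and pick_min_depth()
names are hypothetical and not part of the patch, and the selection loop
ignores access state, which the real queue-depth policy would also
consider:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the two fields the patch adds to mpath_device. */
struct mpath_device {
	atomic_int nr_active;	/* requests currently in flight on this path */
	int numa_node;		/* set by the driver at path creation */
};

/* Hypothetical driver hook: account a request when it is issued. */
static void drv_start_request(struct mpath_device *dev)
{
	assert(dev != NULL);
	atomic_fetch_add(&dev->nr_active, 1);
}

/* Hypothetical driver hook: drop the accounting when the request ends. */
static void drv_end_request(struct mpath_device *dev)
{
	assert(dev != NULL);
	atomic_fetch_sub(&dev->nr_active, 1);
}

/*
 * Simplified queue-depth selection: return the path with the fewest
 * in-flight requests, or NULL if there are no paths. Ties go to the
 * earliest entry.
 */
static struct mpath_device *pick_min_depth(struct mpath_device *devs, size_t n)
{
	struct mpath_device *best = NULL;
	int best_depth = 0;

	for (size_t i = 0; i < n; i++) {
		int depth = atomic_load(&devs[i].nr_active);

		if (!best || depth < best_depth) {
			best = &devs[i];
			best_depth = depth;
		}
	}
	return best;
}
```

The library only needs to read nr_active during path selection, so the
increment and decrement can stay in the driver's request start and
completion paths.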
Thanks