[PATCH] nvme: remove multipath module parameter

John Meneghini jmeneghi at redhat.com
Tue Feb 18 15:06:05 PST 2025


On 2/18/25 12:15 PM, Keith Busch wrote:
> On Tue, Feb 18, 2025 at 11:31:58AM -0500, John Meneghini wrote:
>> On 2/18/25 10:06 AM, Keith Busch wrote:
>>> On Thu, Feb 13, 2025 at 03:37:28PM -0500, John Meneghini wrote:
>>>> Keith, Christoph and Sagi,
>>>>
>>>> This patch has been fully tested and analyzed by Red Hat's QA group and no
>>>> unexpected side effects or regressions have been found. Both NVMe/FC and NVMe/TCP
>>>> have been tested. Our QE engineer has asked me to report this upstream.
>>>
>>> What's the harm in leaving the parameter? *I* use it so I can test both
>>> ways without needing to compile multiple kernels.
>>
>> LOL.  We've been talking about this since 2017. The goal has always been to remove support for DMMP with NVMe.
> 
> I understand that disabling nvme native mp it is required for device
> mapper, and I get you want to prevent the possibility of anyone using
> dm-mp with nvme, but that isn't the only user that wants to see all
> namespace paths.

OK, maybe you have a use case in mind. I'll assume you do: some
applications want to disable native nvme multipathing and to see
all namespace paths in user space, and they are using this parameter
to do it. These are ostensibly user space MP applications.

So what happens when one of these user space MP applications needs
a change or a bug bug fix in the kernel?  Are those patches being
merged into the kernel under a different auspices... is it just
DMMP that we don't want to enable support for?
    >> We want to remove this parameter because it is causing confusion with users and customers who keep trying to use
>> DMMP with their multipath NVMe devices.
>>
>> We want to remove this parameter because:
>>
>> 1) the upstream kernel does not support multipath nvme devices without CONFIG_NVME_MULTIPATH enabled
> 
> What do you mean by "support"? I assume you mean no one upstream will
> help you debug your problems if you've set yourself up that way, and
> that's probably true. The kernel currently doesn't stop you from doing
> this though, so it's supported in that sense. Some people are fine doing
> this on their own, they're not seeking upstream help. Changing this
What I mean by "no support" is: Red Hat has proposed multiple patches that were needed
to support core.nvme_multipath=N with DMMP. Those patches have all been NAKed because
"we don't support anything but native nvme multipath with NVMe".

And we are perfectly capable of debugging and supporting all of this w/out any upstream
help. The fact is we had to patch nvme/host/core.c and multipath.c to support DMMP with
nvme.core_multipath=N, and we've carried those out-of-tree patches in RHEL for almost 6
years because they were NAKed upstream.

> would break userspace because it makes the driver fail to create device
> nodes that used to show up.

User space has already been broken. I think we crossed that bridge long ago. Long before
multipath.c was implemented people were using user space MP applications (like DMMP) to support
nvme devices with multiple paths. We took that away as a supported feature a long time ago.

The reason I am saying this is because Red Hat wants to support DMMP with NVMe. We have many customers
who are very upset and unhappy that we are taking this away. Red Hat would much rather use DMMP
with NVMe, than not, and we have been working diligently to close the feature gap between native
nvme multipath and DMMP multipath (e.g. the queue-depth patches).

>> 2) the upstream kernel does not support multipath nvme devices when core.nvme_multipath is set to N
>> 3) Non-multipath nvme devies are supported just fine with core.nvme_multipath is set to Y
>>
>> You don't need set core.nvme_multipath to N to test your devices both ways w/o recompiling the kernel.
>> All of the code paths involved here are controlled by NVME_CTRL_CMIC_MULTI_CTRL and setting core.nvme_multipath
>> to N doesn't do anything to help your single path NVMe devices. It doesn't remove multipath.c, reduce the code
>> path length or do anything else to optimize your non-NVME_CTRL_CMIC_MULTI_CTRL devices.  All it does is provide
>> an escape hatch to disable the incore multipath scheduler start creating multiple /dev/nvme%d/n%d entries so
>> that DMMP can be used with multipath capable NVMe devices.
>>
>> Personally, I'd like to remove CONFIG_NVME_MULTIPATH as well. It's just another source of confusion. Most users
>> are running Linux with the the default settings for NVME_MULTIPATH. This is what Red Hat customers use and that's
>> what's used upstream.  So what's the point?
> 
> There are devices that report CMIC and NMIC despite being single path,
> perhaps as some vestigial sr-iov feature. That adds an unnecessary layer
> for all IO to go through. Having the param makes it easy to test both
> possible driver paths. In production though, I'd expect to just disable
> the CONFIG option if that's the behavior someoone wants, so I think the
> config option ought to stay.

I get it... that's fine... I have no problem if we want to keep support for CONFIG_NVME_MULTIPATH=N.
That's what we originally agreed to and it was my mistake to send a patch the tried to remove it.

As it turns out, user space MP may be supportable with both CONFIG_NVME_MULTIPATH on or off.

Maybe we should just remove all of these dev_warnings from the code and let the users do what
they want. That's what they are going to do anyways. Just don't tell me we won't take patches
to support my user space MP application and not yours.

I am not trying to be a jerk here. Seriously. It's just that our duplicity about this subject
upstream is a little too much. We have these configurable options and parameters... if we don't
want to support them all then let's take them out. People who want continued support for user
space MP with core.nvme_multipath=N can take *this patch* out of their own out-of-tree repository.

This all comes down to who gets to carry the out-of-tree patch. If you don't want to take this patch,
then please take our DMMP multipath support patches. They are not big or invasive, they haven't
changed in years, and they are all controlled by core.nvme_multipath=N. My customers will thank
you if you do.

I just need the upstream maintainers to make up their mind. This is a change we've talked about
for years. I thought you'd want this change. I thought this would be a final end of the
"only native nvme multipath is supported" debate.

/John




More information about the Linux-nvme mailing list