[PATCH] nvme-multipath: add an 'ana_groups_only' module option

Hannes Reinecke hare at suse.de
Thu Feb 10 00:17:44 PST 2022


On 2/10/22 03:52, John Meneghini wrote:
> On 2/9/22 03:07, Christoph Hellwig wrote:
>> On Mon, Feb 07, 2022 at 11:00:05AM +0100, Hannes Reinecke wrote:
>>> On large installations the ANA log buffer can be exceedingly large;
>>> we've come across a controller with 49 ANA Group Descriptors and
>>> 65536 namespaces, resulting in an ANA buffer with an order-7 allocation.
>>> And this is just to validate that the namespace ID is _really_ listed
>>> in the log page.
>>> So to avoid an overly large memory allocation we can leverage the
>>> 'RGO' bit when retrieving the ANA log page, and check whether the
>>> ANA group ID from the namespace is found in the ANA descriptors.
>>> That cuts down the memory allocation, and provides the same result.
>>> But to be on the safe side I've added a module option 'ana_groups_only'
>>> to switch between modes.
>>
>> How is this supposed to work?  We'll fail to see what namespaces
>> the change applies to.
>>
>> So in doubt fix the controller config to be less broken (and say hello
>> to NetApp and explain to them that they do not need more namespaces for more
>> performance), and if that fails switch to a vmalloc allocation for
>> the buffer.
> 
> I agree with Christoph.  I don't see the point in supporting 65536 
> namespaces across 49 ANA groups or controllers. The problem here is: the 
> vendor is trying to turn NVMe into SCSI.
> 
> Moreover, I don't understand how implementing this as a MODULE_PARM is 
> supposed to work.  If you turn this module parameter on, it assumes 
> all NVMe-oF arrays connected to the host support RGO. What's really 
> needed here is some kind of protocol mechanism that will allow the host 
> to dynamically discover whether RGO is supported on a 
> controller-by-controller basis.
> 
> And this isn't a NetApp array. Just look at who's asking for this 
> change if you want a clue as to what NVMe-oF array is asking for this.  
> I know for a fact this isn't a NetApp array.
> 
Indeed, you are correct. Surprisingly we have more customers/partners 
implementing NVMe :-)

All things considered I guess I'll have to go with the kvmalloc approach.
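
For reference, the arithmetic behind the order-7 figure can be sketched
roughly as below. This is a userspace illustration, not the driver code:
the 16-byte header, 32-byte group descriptor and 4-byte NSID entry are
the ANA log page field sizes as I read the spec, the 4096-byte page size
is an assumption, and the helper names ana_log_size() and alloc_order()
are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed ANA log page layout sizes (bytes); not taken from the patch. */
#define ANA_HDR_SIZE        16u  /* log page header */
#define ANA_GROUP_DESC_SIZE 32u  /* one ANA Group Descriptor, without NSIDs */
#define ANA_NSID_SIZE        4u  /* one namespace ID entry */
#define ASSUMED_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/*
 * Worst-case buffer size: header, one descriptor per group, and one
 * NSID entry per namespace. Pass nr_namespaces = 0 to model a log page
 * fetched with RGO set, where descriptors carry no NSID lists.
 */
static size_t ana_log_size(uint32_t nr_groups, uint32_t nr_namespaces)
{
	return ANA_HDR_SIZE +
	       (size_t)nr_groups * ANA_GROUP_DESC_SIZE +
	       (size_t)nr_namespaces * ANA_NSID_SIZE;
}

/* Smallest order such that (page size << order) covers 'size' bytes. */
static unsigned int alloc_order(size_t size)
{
	unsigned int order = 0;

	while (((size_t)ASSUMED_PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

Under those assumptions, ana_log_size(49, 65536) comes to 263728 bytes,
which needs 65 pages and so rounds up to an order-7 (128 page) physically
contiguous allocation. The groups-only variant, ana_log_size(49, 0), is
1584 bytes and fits in a single page. A kvmalloc-style allocation keeps
the full buffer but no longer requires the pages to be contiguous.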

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare at suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


