[PATCH v3 0/4] nvme: improve error handling and ana_state to work well with dm-multipath

Hannes Reinecke hare at suse.de
Sat May 1 12:58:00 BST 2021


On 4/20/21 5:46 PM, Laurence Oberman wrote:
[ .. ]
> 
> Let me add some reasons why as primarily a support person that this is
> important and try avoid another combative situation.
> 
> Customers depend on managing device-mapper-multipath the way it is now
> even with the advent of nvme-over-F/C. Years of administration and
> management for multiple Enterprise O/S vendor customers (Suse/Red Hat,
> Oracle) all depend on managing multipath access in a transparent way.
> 
> I respect everybody's point of view here but native does change log
> alerting and recovery and that is what will take time for customers to
> adopt.
> 
> It is going to take time for Enterprise customers to transition so all
> we want is an option for them. At some point they will move to native
> but we always like to keep in step with upstream as much as possible.
> 
> Of course we could live with RHEL-only for while but that defeats our
> intention to be as close to upstream as possible.
> 
> If we could have this accepted upstream for now perhaps when customers
> are ready to move to native only we could phase this out.
> 
> Any technical reason why this would not fly is of course important to
> consider but perhaps for now we have a parallel option until we dont.
> 
Curiously, we (as in: we SLES developers) have found just the opposite.
NVMe is a new technology, so by necessity there are no existing
installations that we have to stay compatible with.
We switched to native NVMe multipathing with SLE15, and decided to
educate customers that NVMe is a different concept than SCSI, and that one
shouldn't try to treat both the same way. This was helped by the fact that
SLE15 is a new release, so customers were accustomed to having to change
bits and pieces in their infrastructure to support new releases.

Overall it worked reasonably well; we sure found plenty of bugs, but
that was kind of expected, and for better or worse nearly all of them
turned out to be upstream issues. Which was good for us (nothing beats
being able to blame things on upstream, if one is careful to not linger
too much on the fact that one is part of upstream); and these things
will need to be fixed upstream anyway.
So we had a bit of a mixed experience, but customers seemed to be happy 
enough with this step.

Sorry about that :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare at suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


