Native multipath across multiple subsystem NQNs

Thu Mar 3 03:04:48 PST 2022

> On 01-Mar-2022, at 4:38 PM, Christoph Hellwig <hch at lst.de> wrote:
> 
> 
> 
> On Thu, Feb 24, 2022 at 05:53:06PM -0700, Randy Jennings wrote:
>> In addition to other considerations, migrating a namespace with hot data
>> from a different storage vendor array nondisruptive to client i/o would
>> be difficult to do without having a namespace exist under multiple NQNs.
>> One approach that has been used for SCSI is by multipathing to both
>> targets & a proxy mirroring writes; on cut-over, the other path
>> disconnects.  If the filehandle has to change, that is disruptive to the
>> client software.
>> 
>> Unifying different storage arrays behind the same NQN/virtual subsystem
>> requires coordination of nvme ctrl_ids at least, and doing that between
>> different storage vendor arrays is unlikely.  Having a mechanism to
>> migrate between different JBOF devices non-disruptively would be helpful
>> regardless of the source/destination vendors.  Such devices will
>> probably not have the option of virtual subsystems.
> 
> Yes, it does require coordination, so please coordinate.
> 
>> In other words, even with implementing virtual subsystems, we still have
>> use for non-disruptively moving a namespace between subsystems.  How
>> will this usecase be supported on Linux?

Implementing the same subsystem NQN on multiple storage arrays imposes
excessive architecture/design costs on the target.  In addition to
implementing the same subsystem NQN, the target has to ensure that all
components that make up the virtual subsystem (across multiple arrays)
have the same inventory (the same namespaces with matching NSIDs and ANA
groups).  This has to be kept consistent even when there are
communication failures between those components.  The virtual subsystem
would also be required to have the same host access control settings,
similar QoS settings, etc. on all of its components.  This coordination,
in addition to what is required for controller IDs, is extremely
complicated even when both arrays belong to the same vendor, and next to
impossible if a multi-vendor solution is developed in the future.
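
To make the consistency requirements listed above concrete, here is a
minimal sketch of the checks that every pair of components would have to
pass at all times.  The Component/Namespace model and the
find_mismatches() helper are hypothetical names made up for
illustration; they are not taken from any target implementation or from
the spec.  The sketch also assumes that "coordination of controller IDs"
means the components must never hand out the same controller ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    nsid: int
    uuid: str
    ana_group: int


@dataclass(frozen=True)
class Component:
    """One array's view of the shared virtual subsystem (hypothetical model)."""
    name: str
    subsys_nqn: str
    namespaces: frozenset      # set of Namespace
    allowed_hosts: frozenset   # host NQNs permitted to connect
    cntlids: frozenset         # controller IDs this component may hand out


def find_mismatches(a, b):
    """Return the ways two components fail to look like one subsystem."""
    problems = []
    if a.subsys_nqn != b.subsys_nqn:
        problems.append("subsystem NQN differs")
    if a.namespaces != b.namespaces:
        problems.append("namespace inventory (NSID/UUID/ANA group) differs")
    if a.allowed_hosts != b.allowed_hosts:
        problems.append("host access control differs")
    # Assumption: controller IDs must not collide across components.
    if a.cntlids & b.cntlids:
        problems.append("controller IDs overlap")
    return problems


# Illustrative values only.
ns1 = Namespace(nsid=1, uuid="uuid-1", ana_group=1)
ns2 = Namespace(nsid=2, uuid="uuid-2", ana_group=2)
hosts = frozenset({"nqn.2014-08.org.example:host1"})

array_a = Component("array-a", "nqn.2014-08.org.example:subsys",
                    frozenset({ns1, ns2}), hosts, frozenset({1, 2}))
# array_b lost track of ns2 and reuses controller ID 2.
array_b = Component("array-b", "nqn.2014-08.org.example:subsys",
                    frozenset({ns1}), hosts, frozenset({2, 3}))

print(find_mismatches(array_a, array_b))
```

QoS settings are left out of the sketch because they only need to be
"similar", which is harder to express as a simple equality check.  Each
of these checks has to keep passing while the link between the arrays is
down, which is where most of the cost comes from.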

TP 4034 is designed specifically to address these migration/business
continuity use cases while minimizing that complexity in the target, so
it is quite useful in that respect.

Can you please shed some light on why you consider TP 4034 retrograde?
What issues are you concerned about if Linux nvme were to handle
namespaces shared across subsystems?

Prashanth