[dm-devel] hch's native NVMe multipathing [was: Re: [PATCH 1/2] Don't blacklist nvme]

Christoph Hellwig hch at infradead.org
Fri Feb 17 01:33:06 PST 2017


On Thu, Feb 16, 2017 at 10:13:37AM -0500, Mike Snitzer wrote:
> Not following what you're saying Keith did.  Please feel free to
> clarify.

Keith demonstrated what it takes to support NVMe with dm.  He also
gave a couple of presentations on it in addition to various patches on
the list.

> The middle man is useful if it can support all transports.  If it only
> supports some then yeah the utility is certainly reduced.

Again let's look at what multipathing involves:

 - discovery of multiple paths for a device, and path preferences:
   Storage protocol specific

 - handling of path state and grouping changes:
   Storage protocol specific

 - handling of path up / down events:
   Storage protocol / transport specific if provided

 - keep alive / path checking:
   Storage protocol specific with possible generic fallback

 - path selection:
   Generic, although building heavily on protocol / transport specific
   information

So most of the hard work is transport specific anyway.  And I fully
agree that generic code should be, well, generic.  And with generic
I mean right in the block layer instead of involving a layered block
driver that relies on lots of low-level driver information and setup
from user space.
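
To make the split above concrete, here is a rough sketch in plain
userspace C.  Everything in it is hypothetical (mpath_path,
mpath_proto_ops, mpath_select and the fake_* helpers are made-up names,
not kernel APIs); it only illustrates that discovery and path checking
sit behind protocol specific hooks, while path selection can be generic
code that consumes the state those hooks report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these are kernel APIs. */

/* Ordered so that a larger value means a more preferred path. */
enum path_state { PATH_DOWN, PATH_STANDBY, PATH_OPTIMIZED };

struct mpath_path {
	int id;
	enum path_state state;	/* filled in by protocol specific code */
};

/* Protocol / transport specific hooks. */
struct mpath_proto_ops {
	/* discover paths and their preferences; returns the number found */
	size_t (*discover)(struct mpath_path *paths, size_t max);
	/* keep alive / path check; returns true if the path is usable */
	bool (*check)(struct mpath_path *path);
};

/*
 * Generic path selection: pick the most preferred path that is not
 * down.  It relies entirely on state provided by the protocol code.
 */
static struct mpath_path *mpath_select(struct mpath_path *paths, size_t n)
{
	struct mpath_path *best = NULL;

	assert(paths != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (paths[i].state == PATH_DOWN)
			continue;
		if (!best || paths[i].state > best->state)
			best = &paths[i];
	}
	return best;
}

/* A fake "protocol" reporting one standby and one optimized path. */
static size_t fake_discover(struct mpath_path *paths, size_t max)
{
	size_t n = 0;

	if (n < max)
		paths[n++] = (struct mpath_path){ .id = 0, .state = PATH_STANDBY };
	if (n < max)
		paths[n++] = (struct mpath_path){ .id = 1, .state = PATH_OPTIMIZED };
	return n;
}

static bool fake_check(struct mpath_path *path)
{
	return path->state != PATH_DOWN;
}

static const struct mpath_proto_ops fake_ops = {
	.discover = fake_discover,
	.check = fake_check,
};
```

With the fake protocol above, mpath_select() returns the optimized path
first, falls back to the standby path once the optimized one is marked
down, and returns NULL when no usable path is left.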
	
> I'm going to look at removing any scsi_dh code from DM multipath
> (someone already proposed removing the 'retain_attached_hw_handler'
> feature).  Not much point having anything in DM multipath now that scsi
> discovery has the ability to auto-attach the right scsi_dh via scsi_dh's
> .match hook.

Great.

> As a side-effect it will fix Keith's scsi_dh crash (when
> operating on NVMe request_queue).

I think we'll need to have a quick fix for that ASAP, though.

> My hope is that your NVMe equivalent for scsi_dh will "just work" (TM)
> like scsi_dh auto-attach does.  There isn't a finished ALUA equivalent
> standard for NVMe so I'd imagine at this point you have a single device
> handler for NVMe to do error translation?

Yes, error translation for the block layer, but most importantly
discovery of multiple paths to the same namespace.
