[PATCH 10/10] nvme: implement multipath access to nvme subsystems

Sagi Grimberg sagi at grimberg.me
Mon Aug 28 00:23:33 PDT 2017


> This patch adds initial multipath support to the nvme driver.  For each
> namespace we create a new block device node, which can be used to access
> that namespace through any of the controllers that refer to it.
> 
> Currently we will always send I/O to the first available path; this will
> be changed once the NVMe Asymmetric Namespace Access (ANA) TP is
> ratified and implemented, at which point we will look at the ANA state
> for each namespace.  Another possibility that was prototyped is to
> use the path that is closest to the submitting NUMA node, which will be
> mostly interesting for PCI, but might also be useful for RDMA or FC
> transports in the future.  There is no plan to implement round robin
> or I/O service time path selectors, as those are not scalable with
> the performance rates provided by NVMe.
> 
> The multipath device will go away once all paths to it disappear,
> any delay to keep it alive needs to be implemented at the controller
> level.
> 
> TODO: implement sysfs interfaces for the new subsystem and
> subsystem-namespace object.  Unless we can come up with something
> better than sysfs here..
> 
> Signed-off-by: Christoph Hellwig <hch at lst.de>

Christoph,
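
For context, the path-selection policy described in the quoted text
boils down to "always pick the first usable path to the namespace".
A rough stand-alone C sketch of that idea follows; the names
(demo_path, select_first_available) are made up for illustration and
are not the actual structures or functions from the patch.

#include <stdbool.h>
#include <stddef.h>

/* Illustrative model only, not the driver's real data structures. */
struct demo_path {
	bool live;	/* is the controller behind this path usable? */
	int  ctrl_id;	/* which controller this path goes through */
};

/*
 * "First available path" policy: walk the paths to a namespace in
 * order and return the first one whose controller is usable.
 */
static struct demo_path *select_first_available(struct demo_path *paths,
						size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (paths[i].live)
			return &paths[i];
	return NULL;	/* no usable path right now */
}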

This is really pulling a lot into the nvme driver. I'm not sure if
this approach will be used by other block drivers, but would it
make sense to place the block_device node creation, the make_request
and failover logic, and maybe the path selection in the block layer,
leaving just the construction of the path mappings in nvme?
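
To make the question concrete, the failover flow that would either stay
in nvme's make_request path or move into the block layer is roughly the
loop below.  Again a self-contained C sketch with made-up names
(fo_path, demo_submit_one, demo_submit_with_failover), not code taken
from the patch.

#include <stdbool.h>
#include <stddef.h>

/* Illustrative model only. */
struct fo_path {
	bool live;	/* is the controller behind this path usable? */
};

/* Pretend per-path submission; fails when the path is down. */
static bool demo_submit_one(struct fo_path *p)
{
	return p->live;
}

/*
 * Failover in miniature: try a path, and on failure mark it dead and
 * move on to the next one.  Whether this loop belongs in the nvme
 * driver or in the block layer is exactly the question above.
 */
static int demo_submit_with_failover(struct fo_path *paths, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (demo_submit_one(&paths[i]))
			return (int)i;	/* index of the path that served the I/O */
		paths[i].live = false;	/* give up on this path */
	}
	return -1;	/* all paths exhausted */
}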


