nvme multipath support V7

Gruher, Joseph R joseph.r.gruher at intel.com
Tue Apr 10 12:32:56 PDT 2018


Hi everyone-

I'd like to do some testing of NVMe multipath in an NVMeoF environment.  I'm running kernel 4.15.15 on Ubuntu.  I've set up my kernel NVMeoF target to expose a namespace through two separate ports, then connected the namespace through both ports to my dual-ported initiator, but I didn't see any multipath device created.  I'm probably missing some steps. :)

Rather than send a bunch of dumb questions to the list, are there more detailed instructions or examples for setting up native NVMe multipath in an NVMeoF environment that I might review?  A basic step-by-step example of setting up a simple configuration would be super helpful.
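
For reference, my procedure so far is roughly along the lines below -- the transport, addresses, NQN, and backing device are placeholders rather than my exact values -- in case someone can spot what I'm missing:

  # target side: one namespace exposed through two ports
  # (NQN, IP addresses and backing device below are placeholders)
  modprobe nvmet
  modprobe nvmet-rdma
  cd /sys/kernel/config/nvmet
  mkdir subsystems/testnqn
  echo 1 > subsystems/testnqn/attr_allow_any_host
  mkdir subsystems/testnqn/namespaces/1
  echo -n /dev/nvme0n1 > subsystems/testnqn/namespaces/1/device_path
  echo 1 > subsystems/testnqn/namespaces/1/enable
  for p in 1 2; do
      mkdir ports/$p
      echo rdma         > ports/$p/addr_trtype
      echo ipv4         > ports/$p/addr_adrfam
      echo 192.168.$p.1 > ports/$p/addr_traddr
      echo 4420         > ports/$p/addr_trsvcid
      ln -s /sys/kernel/config/nvmet/subsystems/testnqn ports/$p/subsystems/testnqn
  done

  # initiator side: native multipath has to be compiled in and enabled
  grep NVME_MULTIPATH /boot/config-$(uname -r)     # expect CONFIG_NVME_MULTIPATH=y
  cat /sys/module/nvme_core/parameters/multipath   # expect Y
  nvme connect -t rdma -a 192.168.1.1 -s 4420 -n testnqn
  nvme connect -t rdma -a 192.168.2.1 -s 4420 -n testnqn
  nvme list-subsys                                 # both controllers under one subsystem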

Thanks,
Joe

> -----Original Message-----
> From: Linux-nvme [mailto:linux-nvme-bounces at lists.infradead.org] On Behalf
> Of Christoph Hellwig
> Sent: Thursday, November 9, 2017 9:45 AM
> To: Busch, Keith <keith.busch at intel.com>; Sagi Grimberg <sagi at grimberg.me>
> Cc: Jens Axboe <axboe at kernel.dk>; linux-block at vger.kernel.org; Hannes
> Reinecke <hare at suse.de>; linux-nvme at lists.infradead.org; Johannes
> Thumshirn <jthumshirn at suse.de>
> Subject: nvme multipath support V7
> 
> Hi all,
> 
> this series adds support for multipathing, that is, accessing nvme namespaces
> through multiple controllers, to the nvme core driver.
> 
> I think we are pretty much done, with very few changes in the last reposts.
> Unless I hear objections I plan to send this to Jens tomorrow with the remaining
> NVMe updates.
> 
> It is a very thin and efficient implementation that relies on close cooperation
> with other bits of the nvme driver, and a few small and simple block helpers.
> 
> Compared to dm-multipath the important differences are how management of
> the paths is done, and how the I/O path works.
> 
> Management of the paths is fully integrated into the nvme driver: for each
> newly found nvme controller we check if there are other controllers that refer
> to the same subsystem, and if so we link them up in the nvme driver.  Then for
> each namespace found we check if the namespace ID and identifiers match, in
> which case we have multiple controllers that refer to the same namespace.  For
> now path availability is based entirely on the controller status, which at least
> for fabrics will be continuously updated based on the mandatory keep alive
> timer.  Once the Asymmetric Namespace Access (ANA) proposal passes in
> NVMe we will also get per-namespace states in addition to that, but for now
> any details of that remain confidential to NVMe members.
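> 
> As an illustration (not part of the series itself, and the instance numbers
> below are only an example), the resulting grouping can be seen in sysfs:
> 
>   # controllers that report the same subsystem NQN are linked under a
>   # single nvme-subsystem entry ...
>   ls /sys/class/nvme-subsystem/nvme-subsys0/       # nvme0, nvme1, subsysnqn, ...
>   cat /sys/class/nvme-subsystem/nvme-subsys0/subsysnqn
>   # ... and I/O goes through the one shared namespace node on top:
>   ls -l /dev/nvme0n1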
> 
> The I/O path is very different from the existing multipath drivers, which is
> enabled by the fact that NVMe (unlike SCSI) does not support partial
> completions - a controller will either complete a whole command or not, but
> never only complete parts of it.  Because of that there is no need to clone bios
> or requests - the I/O path simply redirects the I/O to a suitable path.  For
> successful commands multipath is not in the completion stack at all.  For failed
> commands we decide if the error could be a path failure, and if yes remove the
> bios from the request structure and requeue them before completing the
> request.  Altogether this means there is no performance degradation
> compared to normal nvme operation when using the multipath device node (at
> least not until I find a dual-ported DRAM-backed device :))
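> 
> A quick way to see that behaviour (purely an illustration; device and
> controller names below are an example) is to run I/O against the shared
> node and then drop one of the paths underneath it:
> 
>   fio --name=mpath --filename=/dev/nvme0n1 --rw=randread --bs=4k \
>       --ioengine=libaio --iodepth=32 --time_based --runtime=60 &
>   nvme disconnect -d nvme1    # I/O carries on over the remaining controller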
> 
> A git tree is available at:
> 
>    git://git.infradead.org/users/hch/block.git nvme-mpath
> 
> gitweb:
> 
>    http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/nvme-mpath
> 
> Changes since V6:
>   - added slaves/holder directories (Hannes)
>   - trivial rebase on top of the latest nvme-4.15 changes
> 
> Changes since V5:
>   - dropped various prep patches merged into the nvme tree
>   - removed a leftover debug printk
>   - small cleanups to blk_steal_bio
>   - add a sysfs attribute for hidden gendisks
>   - don't complete cancelled requests if another path is available
>   - use the instance index for device naming
>   - make the multipath code optional at compile time
>   - don't use the multipath node for single-controller subsystems
>   - add a module option to disable multipathing (temporarily)
> 
> Changes since V4:
>   - add a refcount to release the device in struct nvme_subsystem
>   - use the instance to name the nvme_subsystems in sysfs
>   - remove a NULL check before nvme_put_ns_head
>   - take a ns_head reference in ->open
>   - improve open protection for GENHD_FL_HIDDEN
>   - add poll support for the mpath device
> 
> Changes since V3:
>   - new block layer support for hidden gendisks
>   - a couple new patches to refactor device handling before the
>     actual multipath support
>   - don't expose per-controller block device nodes
>   - use /dev/nvmeXnZ as the device nodes for the whole subsystem.
>   - expose subsystems in sysfs (Hannes Reinecke)
>   - fix a subsystem leak when duplicate NQNs are found
>   - fix up some names
>   - don't clear current_path if freeing a different namespace
> 
> Changes since V2:
>   - don't create duplicate subsystems on reset (Keith Busch)
>   - free requests properly when failing over in I/O completion (Keith Busch)
>   - new devices names: /dev/nvm-sub%dn%d
>   - expose the namespace identification sysfs files for the mpath nodes
> 
> Changes since V1:
>   - introduce new nvme_ns_ids structure to clean up identifier handling
>   - generic_make_request_fast is now named direct_make_request and calls
>     generic_make_request_checks
>   - reset bi_disk on resubmission
>   - create sysfs links between the existing nvme namespace block devices and
>     the new shared mpath device
>   - temporarily added the timeout patches from James, this should go into
>     nvme-4.14, though
> 