nvme multipath support V7
Gruher, Joseph R
joseph.r.gruher at intel.com
Tue Apr 10 12:32:56 PDT 2018
Hi everyone-
I'd like to do some testing on NVMe multipath in an NVMeoF environment. I'm running Ubuntu kernel 4.15.15 and I've set up my kernel NVMeoF target to expose a namespace through two separate ports, then connected the namespace through both ports from my dual-ported initiator, but I didn't see any multipath device created. I'm probably missing some steps. :)
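For reference, here is roughly the sequence I've been following - the transport, addresses, subsystem NQN, and backing device below are just placeholders from my setup, and this assumes nvme-cli on the initiator plus a kernel built with CONFIG_NVME_MULTIPATH=y:

  # Target side: expose one namespace through two ports via nvmet configfs.
  cd /sys/kernel/config/nvmet
  mkdir subsystems/testnqn
  echo 1 > subsystems/testnqn/attr_allow_any_host
  mkdir subsystems/testnqn/namespaces/1
  echo -n /dev/nvme0n1 > subsystems/testnqn/namespaces/1/device_path
  echo 1 > subsystems/testnqn/namespaces/1/enable
  for p in 1 2; do
      mkdir ports/$p
      echo rdma         > ports/$p/addr_trtype
      echo ipv4         > ports/$p/addr_adrfam
      echo 192.168.$p.1 > ports/$p/addr_traddr
      echo 4420         > ports/$p/addr_trsvcid
      ln -s /sys/kernel/config/nvmet/subsystems/testnqn ports/$p/subsystems/testnqn
  done

  # Initiator side: connect through both ports.  With multipath enabled
  # (CONFIG_NVME_MULTIPATH=y and the nvme_core.multipath module option not
  # turned off) I'd expect both controllers to collapse into one /dev/nvmeXnY.
  nvme connect -t rdma -n testnqn -a 192.168.1.1 -s 4420
  nvme connect -t rdma -n testnqn -a 192.168.2.1 -s 4420
  nvme list
  nvme list-subsys    # if your nvme-cli is new enough to have it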
Rather than send a bunch of dumb questions to the list, are there more detailed instructions or examples for setting up native NVMe multipath in an NVMeoF environment that I might review? A basic step-by-step example of setting up a simple configuration would be super helpful.
Thanks,
Joe
> -----Original Message-----
> From: Linux-nvme [mailto:linux-nvme-bounces at lists.infradead.org] On Behalf
> Of Christoph Hellwig
> Sent: Thursday, November 9, 2017 9:45 AM
> To: Busch, Keith <keith.busch at intel.com>; Sagi Grimberg <sagi at grimberg.me>
> Cc: Jens Axboe <axboe at kernel.dk>; linux-block at vger.kernel.org; Hannes
> Reinecke <hare at suse.de>; linux-nvme at lists.infradead.org; Johannes
> Thumshirn <jthumshirn at suse.de>
> Subject: nvme multipath support V7
>
> Hi all,
>
> this series adds support for multipathing, that is, accessing nvme namespaces
> through multiple controllers, to the nvme core driver.
>
> I think we are pretty much done, with very few changes in the last reposts.
> Unless I hear objections I plan to send this to Jens tomorrow with the remaining
> NVMe updates.
>
> It is a very thin and efficient implementation that relies on close cooperation
> with other bits of the nvme driver and a few small and simple block layer helpers.
>
> Compared to dm-multipath the important differences are how management of
> the paths is done, and how the I/O path works.
>
> Management of the paths is fully integrated into the nvme driver: for each
> newly found nvme controller we check if there are other controllers that refer
> to the same subsystem, and if so we link them up in the nvme driver. Then for
> each namespace found we check whether the namespace ID and identifiers match
> to determine whether multiple controllers refer to the same namespace. For
> now path availability is based entirely on the controller status, which at least
> for fabrics will be continuously updated based on the mandatory keep alive
> timer. Once the Asymmetric Namespace Access (ANA) proposal passes in
> NVMe we will also get per-namespace states in addition to that, but for now
> any details of that remain confidential to NVMe members.
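For what it's worth, this is how I've been trying to confirm that linking on my initiator - the instance names below (nvme-subsys0, nvme0, nvme1, nvme0n1) are just what I happen to get on my setup and may differ:

  # Both controllers should report the same subsystem NQN once linked.
  cat /sys/class/nvme/nvme0/subsysnqn /sys/class/nvme/nvme1/subsysnqn

  # The subsystem shows up under /sys/class/nvme-subsystem/, and the shared
  # namespace node should list the per-controller disks in its slaves directory.
  ls /sys/class/nvme-subsystem/
  ls /sys/block/nvme0n1/slaves/

  # Recent nvme-cli can summarize the topology as well:
  nvme list-subsys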
>
> The I/O path is very different from the existing multipath drivers, which is
> enabled by the fact that NVMe (unlike SCSI) does not support partial
> completions - a controller will either complete a whole command or not, but
> never complete only parts of it. Because of that there is no need to clone bios
> or requests - the I/O path simply redirects the I/O to a suitable path. For
> successful commands multipath is not in the completion stack at all. For failed
> commands we decide if the error could be a path failure, and if so we remove the
> bios from the request structure and requeue them before completing the
> request. Altogether this means there is no performance degradation
> compared to normal nvme operation when using the multipath device node (at
> least not until I find a dual-ported DRAM-backed device :))
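Once I get the shared node to show up, I was planning to sanity-check this failover behavior along the following lines - again, the device names are placeholders, with nvme0n1 being the subsystem-level node and nvme1 one of the two controllers:

  # Run I/O against the shared multipath node...
  dd if=/dev/nvme0n1 of=/dev/null bs=1M iflag=direct &

  # ...then tear down one of the two paths.  If I understand the design
  # right, the in-flight bios get requeued and redirected to the remaining
  # controller instead of erroring out.
  nvme disconnect -d nvme1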
>
> A git tree is available at:
>
> git://git.infradead.org/users/hch/block.git nvme-mpath
>
> gitweb:
>
> http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/nvme-mpath
>
> Changes since V6:
> - added slaves/holder directories (Hannes)
> - trivial rebase on top of the latest nvme-4.15 changes
>
> Changes since V5:
> - dropped various prep patches merged into the nvme tree
> - removed a leftover debug printk
> - small cleanups to blk_steal_bio
> - add a sysfs attribute for hidden gendisks
> - don't complete cancelled requests if another path is available
> - use the instance index for device naming
> - make the multipath code optional at compile time
> - don't use the multipath node for single-controller subsystems
> - add a module_option to disable multipathing (temporarily)
>
> Changes since V4:
> - add a refcount to release the device in struct nvme_subsystem
> - use the instance to name the nvme_subsystems in sysfs
> - remove a NULL check before nvme_put_ns_head
> - take a ns_head reference in ->open
> - improve open protection for GENHD_FL_HIDDEN
> - add poll support for the mpath device
>
> Changes since V3:
> - new block layer support for hidden gendisks
> - a couple new patches to refactor device handling before the
> actual multipath support
> - don't expose per-controller block device nodes
> - use /dev/nvmeXnZ as the device nodes for the whole subsystem.
> - expose subsystems in sysfs (Hannes Reinecke)
> - fix a subsystem leak when duplicate NQNs are found
> - fix up some names
> - don't clear current_path if freeing a different namespace
>
> Changes since V2:
> - don't create duplicate subsystems on reset (Keith Busch)
> - free requests properly when failing over in I/O completion (Keith Busch)
> - new devices names: /dev/nvm-sub%dn%d
> - expose the namespace identification sysfs files for the mpath nodes
>
> Changes since V1:
> - introduce new nvme_ns_ids structure to clean up identifier handling
> - generic_make_request_fast is now named direct_make_request and calls
> generic_make_request_checks
> - reset bi_disk on resubmission
> - create sysfs links between the existing nvme namespace block devices and
> the new shared mpath device
> - temporarily added the timeout patches from James; these should go into
> nvme-4.14, though
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme