[PATCH 0/7] nvme_fc: asynchronous controller create and simple discovery
James Smart
jsmart2021 at gmail.com
Mon May 7 17:12:07 PDT 2018
This patch set modifies the fc transport such that create_ctrl results
in the OS's controller creation only and does not initially connect
to the device on the wire inline to the create_ctrl call. Instead,
create_ctrl immediately schedules the background reconnect thread
to perform the initial association connect. The patchset also contains
several other cleanups found while implementing and testing the
asynchronous connect.
There are two main reasons why asynchronous controller create is done:
1) It simplifies error handling and retries. The initial controller
connect attempts can be disrupted by errors as easily as after
the controller is initially created. As the code currently stands
there has to be special retry logic and prohibitions around state
changes if errors occur during the initial connect (which the code
today does not have). With this patch set, initial connections use
the same path as a reconnect, and any error handling uses the same
paths as if the errors occurred post initial connect.
2) As the create_ctrl() call is now very fast, simplistic udev rules
can be used for auto-connect rather than involving systemd to work
around long initial connect times, especially if errors in initial
connect occur.
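As a rough illustration of reason 1, the following is a toy userspace
model, not code from fc.c. All names in it (toy_ctrl, toy_create_ctrl,
toy_connect_work, toy_connection_lost, toy_flaky_link) are hypothetical.
It only shows the shape of the idea: create does no wire I/O, and the
first connect and any later reconnect go through one shared routine.

```c
#include <stdbool.h>

/* Toy states for illustration only; these are not the kernel's enums. */
enum toy_state { TOY_NEW, TOY_CONNECTING, TOY_LIVE, TOY_DELETED };

struct toy_ctrl {
	enum toy_state state;
	int attempts;		/* connect attempts in the current run */
	int max_attempts;	/* give up after this many attempts */
	bool (*try_connect)(struct toy_ctrl *ctrl);	/* one wire attempt */
};

/* Single connect path, shared by the first connect and by reconnects. */
static void toy_connect_work(struct toy_ctrl *ctrl)
{
	while (ctrl->state == TOY_CONNECTING) {
		ctrl->attempts++;
		if (ctrl->try_connect(ctrl))
			ctrl->state = TOY_LIVE;
		else if (ctrl->attempts >= ctrl->max_attempts)
			ctrl->state = TOY_DELETED;
		/* a real transport would delay and reschedule, not loop */
	}
}

/* Create only sets up the object and marks it for connection. */
static void toy_create_ctrl(struct toy_ctrl *ctrl,
			    bool (*try_connect)(struct toy_ctrl *ctrl),
			    int max_attempts)
{
	ctrl->state = TOY_CONNECTING;
	ctrl->attempts = 0;
	ctrl->max_attempts = max_attempts;
	ctrl->try_connect = try_connect;
}

/* Losing connectivity after going live re-enters the same path. */
static void toy_connection_lost(struct toy_ctrl *ctrl)
{
	if (ctrl->state == TOY_LIVE) {
		ctrl->state = TOY_CONNECTING;
		ctrl->attempts = 0;
	}
}

/* Stand-in link: the first two attempts of a run fail, the third works. */
static bool toy_flaky_link(struct toy_ctrl *ctrl)
{
	return ctrl->attempts > 2;
}
```

In this model a caller would invoke toy_create_ctrl(), which returns
without connecting, and toy_connect_work() would later run in the
background. An error during the first connect is then handled exactly
like an error after the controller has been live.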
However, several hurdles in the common infrastructure need to be changed
in order to make this work. Until now, controller creation did not
return to the cli until the controller was fully connected and live on
the wire. This gave a lot of time for the udev event to be
generated and serviced to create the corresponding /dev file. The cli
now has to be prepared for the case where it accesses the /dev file
before the event has been serviced. There is also a check in the fabrics layer to validate
the controller subnqn is what the connect request asked for. With this
patch set, FC will return before the initial connect is complete thus
the controller field being checked is not yet set. The subnqn has no
reason to differ from what was requested, as the fabric connect requests
are built from that same request, so this nqn check should be removed.
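The device-file race described above can be handled on the cli side
with a bounded poll. The following is a minimal sketch of that idea and
is not the nvme-cli change itself. The helper name wait_for_path and its
parameters are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: poll for a path until it exists or the attempts
 * run out. Returns 0 once stat() succeeds, -1 if the path never
 * appeared or stat() failed with an error other than ENOENT.
 */
static int wait_for_path(const char *path, int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	struct stat st;
	int i;

	for (i = 0; i < max_tries; i++) {
		if (stat(path, &st) == 0)
			return 0;	/* node is present */
		if (errno != ENOENT)
			return -1;	/* unexpected failure, do not spin */
		nanosleep(&delay, NULL);
	}
	return -1;
}
```

A caller would invoke this after a successful controller add and before
opening the device node, for example wait_for_path(path, 50, 100) to
allow roughly five seconds for udev to create the node. The retry count
and delay here are arbitrary example values.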
Additionally, operations such as connect-all may occur while there is
a connectivity disturbance with the discovery controller, thus the
discovery log read may fail. To circumvent the side effects of giving
up and not connecting, the cli needs to retry the discovery log reads.
Therefore, this patch set is dependent on the following modifications
that have been made to the cli:
nvme-cli: Wait for device file if not present after successful add_ctrl
github repo commit id: bb2d87d7f386
nvme-cli: Add ioctl retry support for "connect-all"
github repo commit id: d8f949c21e9d
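The discovery log retry can be pictured as a bounded retry loop around
the failing operation. This is a sketch under assumptions, not the code
from the nvme-cli commit above. The names retry_op, fake_log_read and
fake_log_read_op are hypothetical, and the fake operation merely stands
in for an ioctl that fails during a connectivity disturbance.

```c
#include <stddef.h>
#include <time.h>

/* Operation to retry: returns 0 on success, nonzero on failure. */
typedef int (*retry_op_fn)(void *arg);

/*
 * Call op up to max_tries times, sleeping delay_ms between failed
 * attempts. Returns 0 on the first success, else the last error.
 */
static int retry_op(retry_op_fn op, void *arg, int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int ret = -1;
	int i;

	for (i = 0; i < max_tries; i++) {
		ret = op(arg);
		if (ret == 0)
			break;
		if (i + 1 < max_tries)
			nanosleep(&delay, NULL);	/* back off, then retry */
	}
	return ret;
}

/* Stand-in for a discovery log read that fails a set number of times. */
struct fake_log_read {
	int failures_left;
	int calls;
};

static int fake_log_read_op(void *arg)
{
	struct fake_log_read *f = arg;

	f->calls++;
	if (f->failures_left > 0) {
		f->failures_left--;
		return -1;
	}
	return 0;
}
```

With a fake_log_read set up to fail twice, retry_op(fake_log_read_op,
&f, 5, 100) succeeds on the third call instead of giving up on the
first failure.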
As the patchset now allows simplistic udev scripting, this patchset also
adds the change to the transport to regenerate udev discovery events in
case the system missed the events earlier (such as boot scenarios).
James Smart (7):
  nvme: remove unnecessary controller subnqn validation
  nvme_fc: remove setting DNR on exception conditions
  nvme_fc: retry failures to set io queue count
  nvme_fc: remove reinit_request routine
  nvme_fc: change controllers first connect to use reconnect path
  nvme_fc: fix nulling of queue data on reconnect
  nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device
drivers/nvme/host/fabrics.c | 10 --
drivers/nvme/host/fc.c | 271 ++++++++++++++++++++++++++------------------
2 files changed, 163 insertions(+), 118 deletions(-)
--
2.13.1