[PATCH 3/4] nvme-fabrics: add tp8010 support
Belanger, Martin
Martin.Belanger at dell.com
Fri Jan 28 09:55:40 PST 2022
> >>> Wouldn't it make more sense to delegate explicit registration to
> >>> userspace (ie libnvme/nvme-cli), and leave the kernel out of it?
> >>
> >> It would. There is no reason what-so-ever to place all this register
> >> stuff needs to live in the kernel. We have a way to passthru commands
> >> so it can and should move to userspace.
> >
> > I wish it could be delegated to a user-space app, and in fact that was
> > my original design.
> >
> > Unfortunately, while testing I realized that the kernel can
> > autonomously reconnect and user-space apps are completely unaware of it.
> >
> > For example, let's say the network goes down momentarily. The kernel
> > then tries to reconnect. Once it successfully reconnects it doesn't
> > tell anyone about it.
>
> Then add a uevent on the controller device-node. From there you should
> trap it and do what you need (exactly like how discovery log change events
> are handled).
>
> BTW, I also don't understand why the host needs this reregistration on
> reconnect, but that is besides the point.
The fact is that when connectivity is lost and restored, we don’t
know whether changes have occurred on the host (e.g. maybe the
symbolic name was changed while connectivity was lost, etc.)
and thus registration must be reapplied every time the host reconnects
to make sure there is no stale information at the discovery controller.
>
> > But let's say the kernel does send a signal to user-space on a reconnect.
> > What if there is no user-space app to receive this signal? I'm
> > thinking of the case where one uses nvme-cli to set up persistent
> > connections to discovery controllers. In that case there is no app to
> > send the explicit registration on a re-connect.
>
> This argument does not justify adding functionality in the kernel that doesn't
> belong there. If we were to follow this argument we would be placing
> everything in the kernel. If someone wants this functionality, he/she needs
> to use the tools required for it to work.
Ok, Hannes, Sagi, Christoph, et al. I got the message loud and clear.
Explicit Registration does not belong in the kernel. And, as you suggested,
the kernel needs to send a uevent when it loses connectivity and
another uevent when connectivity is restored. This will allow userspace
applications know when to reapply registration (if needed).
I'm not a uevent expert (at least how to send uevents from the kernel).
From what I read, there's only a limited set of uevents defined (i.e. add,
remove, move, online, offline). The nvme driver already uses the
uevents "add" and "remove" when nvme devices are "created" and
"deleted" respectively . We also have the "change" event that is used
when there's a change in the Log Pages. May I suggest that we use the
"offline" and "online" events when connectivity is "lost" and "restored"
respectively? Please let me know.
Thanks,
Martin
Internal Use - Confidential
More information about the Linux-nvme
mailing list