[PATCH 3/4] nvme-fabrics: add tp8010 support
Hannes Reinecke
hare at suse.de
Sat Jan 29 00:43:59 PST 2022
On 1/29/22 00:02, Belanger, Martin wrote:
>
>> On 1/28/22 18:55, Belanger, Martin wrote:
>>>>>>> Wouldn't it make more sense to delegate explicit registration to
>>>>>>> userspace (ie libnvme/nvme-cli), and leave the kernel out of it?
>>>>>>
>>>>>> It would. There is no reason what-so-ever to place all this
>>>>>> register stuff needs to live in the kernel. We have a way to
>>>>>> passthru commands so it can and should move to userspace.
>>>>>
>>>>> I wish it could be delegated to a user-space app, and in fact that
>>>>> was my original design.
>>>>>
>>>>> Unfortunately, while testing I realized that the kernel can
>>>>> autonomously reconnect and user-space apps are completely unaware
>> of it.
>>>>>
>>>>> For example, let's say the network goes down momentarily. The kernel
>>>>> then tries to reconnect. Once it successfully reconnects it doesn't
>>>>> tell anyone about it.
>>>>
>>>> Then add a uevent on the controller device-node. From there you
>>>> should trap it and do what you need (exactly like how discovery log
>>>> change events are handled).
>>>>
>>>> BTW, I also don't understand why the host needs this reregistration
>>>> on reconnect, but that is besides the point.
>>>
>>> The fact is that when connectivity is lost and restored, we don’t know
>>> whether changes have occurred on the host (e.g. maybe the symbolic
>>> name was changed while connectivity was lost, etc.) and thus
>>> registration must be reapplied every time the host reconnects to make
>>> sure there is no stale information at the discovery controller.
>>>
>>>>
>>>>> But let's say the kernel does send a signal to user-space on a reconnect.
>>>>> What if there is no user-space app to receive this signal? I'm
>>>>> thinking of the case where one uses nvme-cli to set up persistent
>>>>> connections to discovery controllers. In that case there is no app
>>>>> to send the explicit registration on a re-connect.
>>>>
>>>> This argument does not justify adding functionality in the kernel
>>>> that doesn't belong there. If we were to follow this argument we
>>>> would be placing everything in the kernel. If someone wants this
>>>> functionality, he/she needs to use the tools required for it to work.
>>>
>>> Ok, Hannes, Sagi, Christoph, et al. I got the message loud and clear.
>>> Explicit Registration does not belong in the kernel. And, as you
>>> suggested, the kernel needs to send a uevent when it loses
>>> connectivity and another uevent when connectivity is restored. This
>>> will allow userspace applications know when to reapply registration (if
>> needed).
>>>
>>> I'm not a uevent expert (at least how to send uevents from the kernel).
>>> From what I read, there's only a limited set of uevents defined (i.e.
>>> add, remove, move, online, offline). The nvme driver already uses the
>>> uevents "add" and "remove" when nvme devices are "created" and
>>> "deleted" respectively . We also have the "change" event that is used
>>> when there's a change in the Log Pages. May I suggest that we use the
>>> "offline" and "online" events when connectivity is "lost" and "restored"
>>> respectively? Please let me know.
>>>
>>
>> I guess we already have that; cf commit f6f09c15a767 ("nvme: generate
>> uevent once a multipath namespace is operational again").
>>
>> When multipath is activated (and I guess that will be for nvme-tcp) you will
>> receive a 'change' uevent whenever a path gets reinstated.
>>
>> So that will be the time when an explicit registration will need to be reapplied.
>>
>
> Just want to make sure I understand. When the connection is lost the kernel
> sends nothing. When the connection is restored the kernel sends a "change"
> uevent. And when the Log Pages have changed the kernel also sends a "change"
> uevent.
>
Yes.
> It would be nice to have different events because if we use the "change"
> uevent for both "log pages have changed" and "connection has been
> restored", then the user-space app will have to send both an "explicit
> registration" and a "get log page" whenever it receives a "change" uevent.
> It won’t be able to tell what that "change" uevent is for.
>
Au contraire.
Change events come with a full payload detailing out what the event is
about. In the case of a 'log page changed' uevent the event points to
the controller device, and payload is a string
'NVME_AEN=<aen number>'.
In the case of the 'path reinstated' uevent the event points to the
block device.
So it's easy to differentiate between those two.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
More information about the Linux-nvme
mailing list