[bug report][bisected] blktests nvme/tcp nvme/030 failed on latest linux-block/for-next

Sagi Grimberg sagi at grimberg.me
Thu Aug 11 05:28:21 PDT 2022


>>>>>>>> nvme/030 triggered several errors during CKI tests on
>>>>>>>> linux-block/for-next, pls help check it, and feel free to let me
>>>>>>>> know if you need any test/info, thanks.
>>>>
>>>> Hi Chaitanya and Yi,
>>>>
>>>> This commit (submitted last February) simply exposes two read-only
>>>> attributes to the sysfs.
>>>
>>> Seems it was not the culprit, but nvme/030 can pass after I revert
>>> that commit on v5.19.
>>>
>>> Hi Sagi
>>>
>>> I did more testing and finally found that reverting this udev rule
>>> change in nvme-cli fixes the nvme/030 failure. Could you check
>>> it?
>>>
>>> commit f86faaaa2a1ff319bde188dc8988be1ec054d238 (refs/bisect/bad)
>>> Author: Sagi Grimberg <sagi at grimberg.m
>>> Date:   Mon Jun 27 11:06:50 2022 +0300
>>>
>>>       udev: re-read the discovery log page when a discovery controller
>>> reconnected
>>>
>>>       When using persistent discovery controllers, if the discovery
>>>       controller loses connectivity and manage to reconnect after a while,
>>>       we need to retrieve again the discovery log page in order to learn
>>>       about possible changes that may have occurred during this time as
>>>       discovery log change events were lost.
>>>
>>>       Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>>>       Signed-off-by: Daniel Wagner <dwagner at suse.de>
>>>       Link: https://lore.kernel.org/r/20220627080650.108936-1-sagi@grimberg.me
>>
>> Yes, this change is reverted now from nvme-cli...
>> I'm thinking about how we should solve the original issue. The only way
>> I can think of at this moment is a "reconnected" event. Does anyone have
>> an idea how userspace can do the right thing here without it?
> 
> Hi Sagi. We had a discussion regarding this back in January (or February?).
> 
> I needed such an event on a reconnect for my project, nvme-stas:
> https://github.com/linux-nvme/nvme-stas
> 
> This event was needed so that the host could re-register with a CDC on a
> reconnect (per TP8010). At your suggestion, I added "NVME_EVENT=connected"
> in host/core.c. This has been working great for me. Maybe the udev rule
> could be modified to look for this event.

That is exactly what it does, and that is why nvme discover unexpectedly
connects to all log entries: the udev event also triggers on the first
connect.

In order to address the problem of missed AENs while the controller was
disconnected, we need to re-issue the log page on a reconnect, not on
a first connect.
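
To make the first-connect vs. reconnect distinction concrete, here is a
rough sketch of what a long-running userspace listener could do with only
the existing "NVME_EVENT=connected" uevent. Everything in it is
hypothetical: the ReconnectTracker class, the returned labels, and the
use of DEVNAME as the controller identity are assumptions made for
illustration, not anything nvme-cli or the kernel provides.

```python
from dataclasses import dataclass, field


@dataclass
class ReconnectTracker:
    """Remembers which controllers have already reported 'connected'."""

    seen: set = field(default_factory=set)

    def classify(self, env: dict) -> str:
        """Classify one uevent environment (a dict of KEY -> value)."""
        if env.get("NVME_EVENT") != "connected":
            return "ignore"
        ctrl = env.get("DEVNAME")  # assumed identity key, e.g. "nvme0"
        if ctrl is None:
            return "ignore"
        if ctrl in self.seen:
            # Connected again: AENs may have been missed in between,
            # so the discovery log page should be fetched again.
            return "reconnect"
        # First time this controller is seen: the initial discover has
        # already read the log page, so nothing needs to be re-read.
        self.seen.add(ctrl)
        return "first-connect"

    def forget(self, ctrl: str) -> None:
        """Drop a removed controller so a reused name counts as new."""
        self.seen.discard(ctrl)
```

This only works because the tracker keeps state between events. A udev
rule has no such memory, so it cannot tell the two cases apart from the
"connected" event alone.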


