[PATCH v3 0/2] nvme-fabrics: short-circuit connect retries

Sagi Grimberg sagi at grimberg.me
Thu Mar 7 03:30:00 PST 2024



On 07/03/2024 12:37, Hannes Reinecke wrote:
> On 3/7/24 09:00, Sagi Grimberg wrote:
>>
>> On 05/03/2024 10:00, Daniel Wagner wrote:
>>> I've picked up Hannes' DNR patches. In short, they make the transports
>>> behave the same way when the DNR bit is set on a re-connect attempt. We
>>> had a discussion on this topic in the past and, if I got this right, we
>>> all agreed that the host should honor the DNR bit on a connect
>>> attempt [1].
>> Umm, I don't recall this being conclusive though. The spec ought to 
>> be clearer here I think.
>
> I've asked the NVMexpress fmds group, and the response was pretty 
> unanimous that the DNR bit on connect should be evaluated.

OK.

>
>>>
>>> The nvme/045 test case (authentication tests) in blktests is a good
>>> test case for this after extending it slightly. TCP and RDMA try to
>>> reconnect with an invalid key over and over again, while loop and FC
>>> stop after the first failure.
>>
>> Who says that invalid key is a permanent failure though?
>>
> See the response to the other patchset.
> 'Invalid key' in this context means that the _client_ evaluated the
> key as invalid, i.e. the key is unusable for the client.
> As the key is passed in via the command line, there is no way the client
> can ever change the value, and no amount of retrying will change
> things here. That's what we try to fix.

Where is this retried today? I don't see where a connect failure is
retried, outside of the periodic reconnect.
Maybe I'm missing what the actual failure is here.
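
To make the behaviour under discussion concrete, here is a minimal
userspace sketch of the retry decision as I understand the proposal. It
is not the code from the patches: the helper name, its parameters and the
SKETCH_ prefix are made up for illustration. The only value taken from
the driver is 0x4000, which is how the Linux host represents the DNR flag
in the status word (NVME_SC_DNR in include/linux/nvme.h). I am also
assuming the usual convention that a negative status is an errno from the
transport and a positive one is an NVMe status.

```c
#include <stdbool.h>

/* DNR flag in the status word as kept by the Linux host driver
 * (phase tag already shifted out). Mirrors NVME_SC_DNR. */
#define SKETCH_NVME_SC_DNR 0x4000

/*
 * Hypothetical helper: is another reconnect attempt worthwhile?
 *
 *   status < 0  errno-style transport error, may be transient
 *   status > 0  NVMe status returned for the connect command
 *
 * max_reconnects == -1 is taken to mean "retry forever".
 */
static bool sketch_should_reconnect(int status, int nr_reconnects,
				    int max_reconnects)
{
	/* Controller said "do not retry": give up immediately. */
	if (status > 0 && (status & SKETCH_NVME_SC_DNR))
		return false;

	/* Otherwise fall back to the normal periodic reconnect budget. */
	if (max_reconnects == -1)
		return true;

	return nr_reconnects < max_reconnects;
}
```

With something like this, a connect that fails with DNR set would stop
after the first attempt on every transport, while a timeout or a status
without DNR would still go through the periodic reconnect.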


