[PATCH 0/2] nvme-fabrics: short-circuit connect retries

James Smart jsmart2021 at gmail.com
Thu Jul 8 21:57:52 PDT 2021


On 7/7/2021 11:29 PM, Chao Leng wrote:
> 
> 
> On 2021/6/29 19:00, Hannes Reinecke wrote:
>> On 6/28/21 3:10 AM, Chao Leng wrote:
>>>
>>>
>>> On 2021/6/27 21:39, James Smart wrote:
>>>> On 6/26/2021 5:09 AM, Hannes Reinecke wrote:
>>>>> On 6/26/21 3:03 AM, Chao Leng wrote:
>>>>>>
>>>>>>
>>>>>> On 2021/6/24 16:10, Hannes Reinecke wrote:
>>>>>>> On 6/24/21 9:29 AM, Chao Leng wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2021/6/24 13:51, Hannes Reinecke wrote:
>>>>>>>>> On 6/23/21 11:38 PM, Sagi Grimberg wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> commit f25f8ef70ce2 ("nvme-fc: short-circuit reconnect retries")
>>>>>>>>>>> allowed the fc transport to honour the DNR bit during reconnect
>>>>>>>>>>> retries, allowing to speed up error recovery.
>>>>>>>>>>
>>>>>>>>>> How does this speed up error recovery?
>>>>>>>>>
>>>>>>>>> Well, not exactly error recovery (as there is nothing to recover).
>>>>>>>>> But we won't attempt pointless retries, thereby reducing the 
>>>>>>>>> noise in
>>>>>>>>> the message log.
>>>>>>>> This conflicts with the tcp and rdma targets.
>>>>>>>> You may need to remove the improper NVME_SC_DNR settings at the
>>>>>>>> target. However, that would cause compatibility issues between
>>>>>>>> different versions.
>>>>>>>
>>>>>>> Which ones?
>>>>>> In many scenarios, the target sets DNR for abnormal packets,
>>>>>> but a new connection may not hit the same error.
>>>>>
>>>>> This patch series is only for the DNR bit set in response to the 
>>>>> 'connect' command.
>>>>> If the target is not able to process the 'connect' command, but may
>>>>> be able to in the future, it really should not set the DNR bit.
>>>>>
>>>>>>> I checked the DNR usage in the target code, and they seem to set it
>>>>>>> correctly (ie the result would not change when the command is 
>>>>>>> retried).
>>>>>>> With the possible exception of ENOSPC handling, as this is arguably
>>>>>>> dynamic and might change with a retry.
>>>>>> The DNR status of the old connection may not be relevant to the 
>>>>>> re-established connection.
>>>>>
>>>>> See above.
>>>>> We are just checking the DNR settings for the 'connect' command (or 
>>>>> any other commands being sent during initial controller 
>>>>> configuration).
>>>>> If that fails, the connection was never properly initialized; if the
>>>>> controller would return a different status after a reconnect, it
>>>>> simply should not set the DNR bit ...
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Hannes
>>>>
>>>> Agreed. Since 1.3 spec says: "If set to ‘1’, indicates that if the 
>>>> same command is re-submitted to any controller in the NVM subsystem, 
>>>> then that re-submitted command is expected to fail."
>>> According to the protocol definition, this is not strictly
>>> implemented on the target side.
>>> In nvme/target there are many external errors for which the DNR bit
>>> is set to 1, for example an abnormal fabrics cmd.
>>
>> Yes; but an abnormal fabrics cmd will be abnormal even after a reconnect.
> Sorry for the delayed reply; I was on leave.
> There are many possible causes of an abnormal fabrics cmd.
> In some scenarios the error may not occur again once the link is
> re-established, such as an HBA internal error, or an abnormal or
> attacked network.
>>
>>>>
>>>> Thus if the initial connect fails in this manner, any new
>>>> association will be on a different controller, where it is now
>>>> expected that the connect on that controller will fail too. So why
>>>> continue to connect when each attempt is expected to fail?
>>> Agreed.
>>> We should not attempt to re-establish the connection if the target
>>> cannot work due to an internal target error.
>>> However, the target does not currently behave exactly like this, so
>>> there are conflicts and compatibility issues.
>>
>> Note: we never said the target should _not_ send a DNR state. We only 
>> said the target should only set the DNR bit if a retry (even after 
>> reconnect or on a different controller) will not change the result.
>> Which from what I can see _is_ the case, so I do not see any issues here.
>>
>> Please give details for the issues you are concerned with.
> For example, suppose the host sends a fabrics cmd, but the cmd is
> corrupted due to an HBA internal error or an abnormal or attacked
> network, and the target then sets DNR in its response. If the host does
> not reconnect after a DNR response on the old connection, the
> connection cannot recover automatically. The corruption may be
> transient, and the cmd may succeed after a reconnect, so trying to
> reconnect is the better choice.
> If the host is not going to reconnect after a DNR response, the target
> should set the DNR state only for internal target errors.
>>
>> Cheers,
>>
>> Hannes

It really doesn't matter what you describe happening on the back end of 
the controllers/subsystem.  The rev 1.4 spec says "if the same command 
is re-submitted to any controller in the NVM subsystem, then that 
re-submitted command is expected to fail." - So, if there's a chance 
that a reconnect would succeed, which would be on a different 
controller - then the subsystem is not following that statement, and 
you shouldn't be setting DNR.  If you disagree with this behavior, it 
will need to be taken up with the NVM Express group.

-- james



