[PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set

Mon Feb 13 02:33:10 PST 2023

>>> Clear the FAILFAST_DRIVER bit for authentication commands
>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>> bit is not set in the command result.
>>
>> Can you please explain what is this fixing? and is there a test
>> to reproduce this?
>>
>>>
>>> Signed-off-by: Hannes Reinecke <hare at suse.de>
>>> ---
>>>   drivers/nvme/host/auth.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>> index ef5dcd885b0c..935f340607a7 100644
>>> --- a/drivers/nvme/host/auth.c
>>> +++ b/drivers/nvme/host/auth.c
>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, 
>>> int qid,
>>>               "qid %d auth_submit failed to map, error %d\n",
>>>               qid, ret);
>>>       } else {
>>> +        /* Clear failfast flag to allow for retries */
>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>
>> This should come right after the allocation.
>> But what makes this special than any other fabrics/admin command?
>> Why is it different than connect for example?
> 
> It is not different; it does, however, introduce a difference in behaviour.
> And hence I didn't enable it for all commands, as previously we didn't 
> retry, and it's questionable if we should for things like 
> 'reg_read'/'reg_write' and friends.
> 
> But you are correct for connect, though; we really should retry that 
> one, too, if the DNR bit is not set.

I'm not sure. if the transport is unavailable, we cannot set DNR, which
will lead to a retry of the connect (or equivalent) command. Today the
host will schedule another full controller connect within
reconnect_delay, which is what we want right?