[PATCH 5/5] nvme: retry authentication commands if DNR status bit is not set

Sagi Grimberg sagi at grimberg.me
Mon Feb 13 05:47:06 PST 2023


>>>>> Clear the FAILFAST_DRIVER bit for authentication commands,
>>>>> allowing them to be retried in nvme_decide_disposition() if the DNR
>>>>> bit is not set in the command result.
>>>>
>>>> Can you please explain what this is fixing? And is there a test
>>>> to reproduce this?
>>>>
>>>>>
>>>>> Signed-off-by: Hannes Reinecke <hare at suse.de>
>>>>> ---
>>>>>   drivers/nvme/host/auth.c | 2 ++
>>>>>   1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>>>> index ef5dcd885b0c..935f340607a7 100644
>>>>> --- a/drivers/nvme/host/auth.c
>>>>> +++ b/drivers/nvme/host/auth.c
>>>>> @@ -87,6 +87,8 @@ static int nvme_auth_submit(struct nvme_ctrl *ctrl, int qid,
>>>>>               "qid %d auth_submit failed to map, error %d\n",
>>>>>               qid, ret);
>>>>>       } else {
>>>>> +        /* Clear failfast flag to allow for retries */
>>>>> +        req->cmd_flags &= ~REQ_FAILFAST_DRIVER;
>>>>
>>>> This should come right after the allocation.
>>>> But what makes this more special than any other fabrics/admin
>>>> command? Why is it different from connect, for example?
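
For reference, the retry decision the commit message refers to looks
roughly like this (a simplified sketch of nvme_decide_disposition() in
drivers/nvme/host/core.c; details differ between kernel versions):

static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
{
	if (likely(nvme_req(req)->status == 0))
		return COMPLETE;

	/*
	 * No retry if any REQ_FAILFAST_* flag is set (blk_noretry_request()),
	 * if the controller set the DNR bit, or if the retry budget is
	 * exhausted.  Clearing REQ_FAILFAST_DRIVER is what lets an auth
	 * command reach the DNR check at all.
	 */
	if (blk_noretry_request(req) ||
	    (nvme_req(req)->status & NVME_SC_DNR) ||
	    nvme_req(req)->retries >= nvme_max_retries)
		return COMPLETE;

	/* (multipath failover handling elided) */
	return RETRY;
}
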
>>>
>>> It is not different; it does, however, introduce a difference in
>>> behaviour. Hence I didn't enable it for all commands: previously we
>>> didn't retry at all, and it's questionable whether we should for
>>> things like 'reg_read'/'reg_write' and friends.
>>>
>>> You are correct about connect, though; we really should retry that
>>> one, too, if the DNR bit is not set.
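
(Context on where the flag comes from: driver-generated requests are
marked failfast when they are initialized, which is why neither auth nor
connect is ever retried today.  A simplified sketch of
nvme_init_request() in drivers/nvme/host/core.c follows; exact contents
vary by kernel version.)

void nvme_init_request(struct request *req, struct nvme_command *cmd)
{
	if (req->q->queuedata)
		req->timeout = NVME_IO_TIMEOUT;
	else /* no queuedata implies admin queue */
		req->timeout = NVME_ADMIN_TIMEOUT;

	/* passthru commands should let the driver set the SGL flags */
	cmd->common.flags &= ~NVME_CMD_SGL_ALL;

	/* every driver-generated request starts out as failfast */
	req->cmd_flags |= REQ_FAILFAST_DRIVER;
	nvme_clear_nvme_request(req);
	memcpy(nvme_req(req)->cmd, cmd, sizeof(*cmd));
}
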
>>
>> I'm not sure. If the transport is unavailable, DNR cannot be set, which
>> will lead to a retry of the connect (or equivalent) command. Today the
>> host schedules another full controller connect within reconnect_delay,
>> which is what we want, right?
> 
> I knew this would be coming up, which is why I left it out of the 
> initial patchset.
> 
> Long explanation:
> 
> The thing is, connect handling is not _just_ connect handling.
> With Linux we only have a single (ioctl-like) entry point for the
> userspace 'nvme connect' call.
> 
> That really does several things:
> - Set up the transport
> - Create the transport association for the admin queue
> - Create the admin queue
> - Send 'connect' over that queue
> - Create the transport associations for the I/O queues
> - Create the I/O queues
> - Send 'connect' over those queues
> 
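
Roughly, for the TCP transport that sequence maps to a call chain like
the following (function names taken from recent kernels as an
illustration and simplified; other transports mirror the same
structure):

nvme_tcp_create_ctrl()                   /* entry point for 'nvme connect' */
    nvme_tcp_setup_ctrl()
        nvme_tcp_configure_admin_queue() /* admin queue + its association */
            nvmf_connect_admin_queue()   /* the actual Connect command */
        nvme_tcp_configure_io_queues()   /* I/O queues + associations */
            nvmf_connect_io_queue()      /* Connect command, per I/O queue */
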
> My point here is that the actual 'connect' commands really are straight
> NVMe commands, and as such should be subject to the normal status
> evaluation rules for NVMe commands.
> The problem is the hybrid nature of the connect command:
> 
> _Technically_ the association and the 'connect' command are independent, 
> and as such we would be entirely within spec to allow the 'connect'
> command to fail without having the association torn down. But due to
> our implementation that would leave us unable to re-use the association
> for further connect commands, as that would need to be triggered by
> userspace (which will always create a new association).
> 
> Consequently I would advocate that any non-retryable failure of the
> 'connect' command should always tear down the association, too.
> 
> That would be a deviation from our normal rules (where we always retry
> transport failures), but really is something we need to do if we enable 
> DNR evaluation for the 'connect' command.

How the host handles a failing connect (or any other request) is
entirely the host's choice.

In this patch, you effectively decided to also retry on transport
errors, which will retry up to nvme_max_retries times. Can you please
help me understand what the patch is actually solving? I'm assuming that
this is an actual completion that you see from the controller; my second
question is what happens with this patch when the transport is the
source of the completion?
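
For context on that second question: when the transport (or a queue
teardown) is the one completing the request, the status is
host-generated and never carries DNR. A simplified example is
nvme_cancel_request() in drivers/nvme/host/core.c (host-side errors such
as NVME_SC_HOST_PATH_ERROR behave similarly; details vary by version):

bool nvme_cancel_request(struct request *req, void *data)
{
	/* don't abort an already completed request */
	if (blk_mq_request_completed(req))
		return true;

	/* host-generated status: the DNR bit is not set here */
	nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
	nvme_req(req)->flags |= NVME_REQ_CANCELLED;
	blk_mq_complete_request(req);
	return true;
}

So with REQ_FAILFAST_DRIVER cleared, such completions become retryable
as well, up to nvme_max_retries times.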

Once we understand this, we can discuss any difference between auth and
other driver-generated commands.


