[PATCH 1/4] nvme: authentication error are always non-retryable

Sagi Grimberg sagi at grimberg.me
Thu Mar 7 00:51:44 PST 2024



On 01/03/2024 17:26, Hannes Reinecke wrote:
> On 3/1/24 14:12, Christoph Hellwig wrote:
>>> -    if ((nvme_req(req)->status & 0x7ff) == NVME_SC_AUTH_REQUIRED)
>>> -        return AUTHENTICATE;
>>> -
>>>       if (blk_noretry_request(req) ||
>>>           (nvme_req(req)->status & NVME_SC_DNR) ||
>>>           nvme_req(req)->retries >= nvme_max_retries)
>>>           return COMPLETE;
>>>   +    if ((nvme_req(req)->status & 0x7ff) == NVME_SC_AUTH_REQUIRED)
>>> +        return AUTHENTICATE;
>>
>> This part looks fine (although I'd word the commit message
>> differently for it).
>>
>>> @@ -467,7 +467,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl 
>>> *ctrl)
>>>           if (result & NVME_CONNECT_AUTHREQ_ASCR) {
>>>               dev_warn(ctrl->device,
>>>                    "qid 0: secure concatenation is not supported\n");
>>> -            ret = NVME_SC_AUTH_REQUIRED;
>>> +            ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
>>>               goto out_free_data;
>>>           }
>>>           /* Authentication required */
>>> @@ -475,14 +475,16 @@ int nvmf_connect_admin_queue(struct nvme_ctrl 
>>> *ctrl)
>>>           if (ret) {
>>>               dev_warn(ctrl->device,
>>>                    "qid 0: authentication setup failed\n");
>>> -            ret = NVME_SC_AUTH_REQUIRED;
>>> +            ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
>>
>>
>> .. but the others should never use nvme status codes as they never
>> go out onto the wire.
>>
> Would love to, but to my knowledge we have the convention that NVMe 
> status codes less than 0 should _not_ be retried.

What nvme status code is less than zero?

> NVMe transport errors,
> OTOH, _should_ be retried.
Yes.
> 'connect_admin_queue' and 'connect_io_queue' now straddles the 
> boundary: technically it's an NVMe command, but in
> practice it's a transport command as there are quite a lot of steps
> before we even can send the 'connect' command.
>
> So, how do we handle the return codes from 'connect_admin_queue' ?
> The NVMe core style (with not retrying negative errors) or the NVMe
> transport style (with always retrying until we run out of retries)?

Not exactly sure what is the question here. I am still going over emails 
so there
may be some discussion around this, but it is unclear from the patch 
description
what is the issue this is solving.



More information about the Linux-nvme mailing list