[PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
Hannes Reinecke
hare at suse.de
Sun Jan 12 23:50:46 PST 2025
On 1/11/25 00:16, Sagi Grimberg wrote:
>
> On 08/01/2025 17:45, Hannes Reinecke wrote:
>> On 1/8/25 11:49, Sagi Grimberg wrote:
>>>
>>> On 07/01/2025 10:11, Hannes Reinecke wrote:
>>>> On 12/25/24 10:58, Sagi Grimberg wrote:
>>>>>
>>>>> On 29/11/2024 16:06, Hannes Reinecke wrote:
>>>>>> When a namespace gets unmapped on the target during scanning,
>>>>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>>>>> With the current code we will ignore that error on the grounds
>>>>>> that we failed to get information, and hence cannot make any
>>>>>> decisions whether to keep or remove that namespace.
>>>>>> But a non-retryable error implies that the namespace is _not_
>>>>>> present as we cannot retry that command and will never get
>>>>>> information about that namespace.
>>>>>> And we need to remove the namespace during scanning, as otherwise
>>>>>> the AEN informing us about a namespace change will find the NSID
>>>>>> present, but nvme_validate_ns() will fail, and the namespace
>>>>>> will never be updated with the correct information.
>>>>>
>>>>> Isn't that a bit harsh?
>>>>> I would expect to see a specific status like NVME_SC_INVALID_NS or
>>>>> equivalent for a full removal of the namespace.
>>>>
>>>> Does it matter? If we get a DNR status back from
>>>> nvme_identify_ns_descs() we _cannot_ resend that command.
>>>> Meaning we cannot get the namespace descriptors. As we
>>>> rely on these descriptors to properly map the namespace
>>>> we cannot correctly work with it, and we're better off
>>>> pretending the namespace is gone and waiting for an AEN
>>>> indicating that the namespace (or controller) state has changed.
>>>
>>> I think it does matter. I don't think we should be removing the NS
>>> without the controller telling us that it is actually removed.
>>
>> But what would be the recovery action here?
>> If the 'identify ns descs' command cannot be retried, how are
>> we going to map the namespace to an ns_head?
>
> Let's take a step back here. Can you describe the scenario you hit?
> What was the error status that you observed?

See my reply to Nilay.
The target unmaps the namespace, and we get an AEN for 'namespace changed'.
We start a scan, but by the time we retrieve the list of namespaces the
NSID has been reassigned to another namespace with a different UUID.
Our stale namespace will then be removed in nvme_validate_ns() and
not rescanned.
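
To make the two positions in this thread concrete, here is a minimal
user-space sketch of the decision, not the actual driver code. The
function names policy_dnr() and policy_invalid_ns() are hypothetical,
and the constant values (DNR bit 0x4000, Invalid Namespace 0x0b, status
mask 0x7ff) are assumptions modelled on include/linux/nvme.h. A positive
status stands for an NVMe completion status, a negative one for a
transport errno.

```c
#include <stdint.h>

/* Assumed values, modelled on include/linux/nvme.h; illustration only. */
#define SC_INVALID_NS 0x000b /* Invalid Namespace or Format */
#define SC_DNR        0x4000 /* Do Not Retry bit */
#define SC_CODE_MASK  0x07ff /* status code type + status code */

enum scan_action { NS_KEEP, NS_REMOVE };

/*
 * Position of the patch: any non-retryable NVMe status from the
 * 'identify ns descs' command removes the namespace.
 * Negative values (transport errors) never trigger removal.
 */
enum scan_action policy_dnr(int status)
{
	if (status > 0 && (status & SC_DNR))
		return NS_REMOVE;
	return NS_KEEP;
}

/*
 * Position raised in review: remove the namespace only when the
 * controller explicitly reports it as invalid.
 */
enum scan_action policy_invalid_ns(int status)
{
	if (status > 0 && (status & SC_CODE_MASK) == SC_INVALID_NS)
		return NS_REMOVE;
	return NS_KEEP;
}
```

The two policies differ only for non-retryable statuses other than
'invalid namespace': policy_dnr() removes the namespace, while
policy_invalid_ns() keeps it.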
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich