[PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed

Hannes Reinecke hare at suse.de
Wed Jan 15 00:02:18 PST 2025


On 1/15/25 08:48, Nilay Shroff wrote:
> 
> 
> On 1/13/25 7:59 PM, Hannes Reinecke wrote:
>> On 1/13/25 15:12, Nilay Shroff wrote:
>>>
>>>
>>> On 1/13/25 1:13 PM, Hannes Reinecke wrote:
>>>> On 1/11/25 15:01, Nilay Shroff wrote:
>>>>>
>>>>>
>> [ .. ]
>>>> So my argument is that in this specific case the 'ANA inaccessible' nvme
>>>> state should _not_ be retried, but should be treated as identical to
>>>> 'invalid namespace' errors.
>>>>
>>> I think I understand what you're proposing. When this issue manifests, the host would
>>> need to differentiate between the reasons nvme_identify_ns_descs() failed: did it fail
>>> because the nsid has been removed/unmapped on the target, or because of the "ANA inaccessible"
>>> state? IMO, for the "ANA inaccessible" status we may not want to immediately remove the ns from
>>> the host (for the reason I mentioned earlier, per NVMe spec section 8.1.3.3); for the
>>> other error case, however, we may remove the ns from the host.
>>> I think issuing the ns descriptor list command to the target for an nsid which doesn't exist on the
>>> target would return a buffer filled with all zeros. So that might be an indication that the ns has
>>> been removed from the target.
>>>    
>> But only if the NSID has not been remapped in the meantime.
>> If it has (as in my case) the ns descriptor list will be valid, it just
>> refers to another namespace.
>>
> If the NSID has been unmapped and then remapped on the target, then in that case
> the host would hit the uuid mismatch case (under nvme_validate_ns()), and so the host
> would then remove the namespace.
> 
> I think there are two cases,
> Case1:
> 1. AEN triggers rescan
> 2. List of active nsid is retrieved
> -> NSID A is removed on the target
> 3. Scanning of NSID A fails (i.e. nvme_identify_ns_descs() returns a buffer filled with all zeros)
> -> host removes the respective namespace
> 
> Case2:
> 1. AEN triggers rescan
> 2. List of active nsid is retrieved
> -> NSID A is unmapped and remapped (possibly with different uuid) on target
> 3. Scanning of NSID A succeeds
> 4. host finds a uuid mismatch for NSID A (i.e. nvme_validate_ns() fails)
> -> host removes the respective namespace
>   
Entirely correct.
But Case2 results in the new namespace never being scanned, so it never
becomes visible to the OS. That is the error I'm fighting with.
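
To make the two cases concrete, here is a toy model in Python. It is
not kernel code: the Target and Host classes, rescan(), remove_ns() and
remap_ns() are hypothetical names invented for this sketch, and the
logic only encodes the behaviour as described in this thread (a failed
descriptor lookup or a uuid mismatch removes the namespace, and nothing
in the same pass adds the remapped one).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Target:
    # NSID -> uuid of the namespace currently mapped there (hypothetical model)
    namespaces: Dict[int, str] = field(default_factory=dict)


@dataclass
class Host:
    # NSID -> uuid of the namespace the host has already set up
    known: Dict[int, str] = field(default_factory=dict)


def rescan(host, target, change=None):
    """One rescan pass; 'change' mutates the target after the NSID list is read."""
    # Steps 1-2: AEN triggers the rescan, list of active NSIDs is retrieved.
    active = sorted(target.namespaces)
    # The target changes between list retrieval and the per-NSID scan.
    if change is not None:
        change(target)
    events = []
    for nsid in active:
        # Stand-in for the descriptor lookup: None models "no such namespace".
        uuid = target.namespaces.get(nsid)
        if uuid is None:
            host.known.pop(nsid, None)
            events.append((nsid, "removed: no descriptors"))
        elif nsid not in host.known:
            host.known[nsid] = uuid
            events.append((nsid, "added"))
        elif host.known[nsid] != uuid:
            # Assumption taken from the thread: the old namespace is removed,
            # and the new one is not set up in this pass.
            del host.known[nsid]
            events.append((nsid, "removed: uuid mismatch"))
        else:
            events.append((nsid, "unchanged"))
    return events


def remove_ns(nsid):
    def change(target):
        target.namespaces.pop(nsid, None)
    return change


def remap_ns(nsid, new_uuid):
    def change(target):
        target.namespaces[nsid] = new_uuid
    return change


# Case1: NSID 1 is removed on the target after the list was retrieved.
host1 = Host({1: "uuid-a"})
events1 = rescan(host1, Target({1: "uuid-a"}), remove_ns(1))

# Case2: NSID 1 is unmapped and remapped with a different uuid.
target2 = Target({1: "uuid-a"})
host2 = Host({1: "uuid-a"})
events2 = rescan(host2, target2, remap_ns(1, "uuid-b"))
```

In this model Case2 ends with host2.known empty while target2 still
exports NSID 1 with the new uuid. Without another rescan the host never
sees it.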

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


