[PATCH] nvme: Remove namespace when nvme_identify_ns_descs() failed
Sagi Grimberg
sagi at grimberg.me
Wed Jan 8 02:49:24 PST 2025
On 07/01/2025 10:11, Hannes Reinecke wrote:
> On 12/25/24 10:58, Sagi Grimberg wrote:
>>
>>
>>
>> On 29/11/2024 16:06, Hannes Reinecke wrote:
>>> When a namespace gets unmapped on the target during scanning,
>>> nvme_identify_ns_descs() returns with a non-retryable error.
>>> With the current code we will ignore that error on the grounds
>>> that we failed to get information, and hence cannot make any
>>> decisions whether to keep or remove that namespace.
>>> But a non-retryable error implies that the namespace is _not_
>>> present as we cannot retry that command and will never get
>>> information about that namespace.
>>> And we need to remove the namespace during scanning, as otherwise
>>> the AEN informing us about a namespace change will find the NSID
>>> present, but nvme_validate_ns() will fail, and the namespace
>>> will never be updated with the correct information.
>>
>> Isn't that a bit harsh?
>> I would expect to see a specific status like NVME_SC_INVALID_NS or
>> equivalent for a full removal of the namespace.
>
> Does it matter? If we get a DNR status back from
> nvme_identify_ns_descs() we _cannot_ resend that command.
> Meaning we cannot get the namespace descriptors. As we
> rely on these descriptors to properly map the namespace
> we cannot correctly work with it, and we're better off
> pretending the namespace is gone and waiting for an AEN
> indicating that the namespace (or controller) state has changed.
I think it does matter. I don't think we should be removing the NS without
the controller telling us that it is actually removed.
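
To make the two positions concrete, here is a minimal standalone sketch,
not the actual patch. It contrasts "remove on any DNR status" with "remove
only when the controller explicitly reports an invalid namespace". All
SKETCH_* names, the ns_action enum and both policy functions are
hypothetical; the numeric values are assumptions modelled on how the Linux
driver encodes NVMe status, and should be checked against
include/linux/nvme.h before being relied on.

```c
#include <stdint.h>

/* Assumed encoding for this sketch: low bits carry the status code,
 * one high bit carries "Do Not Retry". Values are illustrative. */
#define SKETCH_SC_MASK        0x07ffu
#define SKETCH_SC_INTERNAL    0x0006u  /* generic internal error */
#define SKETCH_SC_INVALID_NS  0x000bu  /* invalid namespace or format */
#define SKETCH_STATUS_DNR     0x4000u  /* command must not be retried */

enum ns_action { NS_KEEP, NS_REMOVE };

/* Policy argued for by the patch: any non-retryable failure of the
 * descriptor lookup is treated as "namespace is gone". */
static enum ns_action policy_remove_on_dnr(uint16_t status)
{
	if (status & SKETCH_STATUS_DNR)
		return NS_REMOVE;
	return NS_KEEP;
}

/* Policy argued for in this reply: remove only when the controller
 * says the namespace is invalid; keep it on any other failure. */
static enum ns_action policy_remove_on_invalid_ns(uint16_t status)
{
	if ((status & SKETCH_SC_MASK) == SKETCH_SC_INVALID_NS)
		return NS_REMOVE;
	return NS_KEEP;
}
```

The two policies only diverge for a DNR status that is not an invalid
namespace status, for example an internal error with DNR set: the first
removes the namespace, the second keeps it.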