[PATCH] fix: nvme_update_ns_info method should be called even if nvme_ns_ids_equal returns false
Tao Jin
me at kingtous.cn
Fri Apr 8 17:58:27 PDT 2022
Thanks for your kind reply.
The output of "nvme ns-descs /dev/nvme0n1" is shown below.
Before suspend:
NVME Namespace Identification Descriptors NS 1:
uuid : 01000000-0000-0000-0000-000000000000
After suspend, the uuid seems to have disappeared:
NVME Namespace Identification Descriptors NS 1:
eui64 : 0100000000000000
If I do more suspend operations, the output is the same:
NVME Namespace Identification Descriptors NS 1:
eui64 : 0100000000000000
Note that I'm using a custom kernel that comments out "goto out_free_id",
so "nvme_update_ns_info" is called even if the identifier comparison
fails. With the official kernel I can't suspend at all: suspending makes
the SSD disappear from Linux, which triggers ext4 errors and freezes the
laptop.
```
static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_ids *ids)
{
	struct nvme_id_ns *id;
	int ret = NVME_SC_INVALID_NS | NVME_SC_DNR;

	if (test_bit(NVME_NS_DEAD, &ns->flags))
		goto out;

	ret = nvme_identify_ns(ns->ctrl, ns->head->ns_id, ids, &id);
	if (ret)
		goto out;

	ret = NVME_SC_INVALID_NS | NVME_SC_DNR;
	if (!nvme_ns_ids_equal(&ns->head->ids, ids)) {
		dev_err(ns->ctrl->device,
			"identifiers changed for nsid %d\n", ns->head->ns_id);
-		goto out_free_id;
	}

	ret = nvme_update_ns_info(ns, id);

out_free_id:
	kfree(id);
out:
	/*
	 * Only remove the namespace if we got a fatal error back from the
	 * device, otherwise ignore the error and just move on.
	 *
	 * TODO: we should probably schedule a delayed retry here.
	 */
	if (ret > 0 && (ret & NVME_SC_DNR))
		nvme_ns_remove(ns);
}
```
In addition, Windows 10/11 has no suspend issue on this laptop, which is
really weird.
On 2022/4/9 00:04, Christoph Hellwig wrote:
> On Fri, Apr 08, 2022 at 09:18:19AM -0600, Keith Busch wrote:
>> On Fri, Apr 08, 2022 at 10:07:21AM +0200, Christoph Hellwig wrote:
>>> On Fri, Apr 08, 2022 at 03:56:49PM +0800, 金韬 wrote:
>>>> This is output from dmesg. Seems that "eui" has changed.
>>>>
>>>> [ 2.086226] loop0: detected capacity change from 0 to 8
>>>> [ 26.577001] eui changed from 0100000000000000 to 0000000000000001
>>>> [ 26.577003] nvme nvme0: identifiers changed for nsid 1
>>>
>>> Ok, looks like the device is broken and changes the EUID after power
>>> cycles. Can you send the output of lspci -v?
>>>
>>> Also just out of curiosity, does the ID keep changing if you do more
>>> suspend cycles?
>>
>> The eui isn't legit in the first place (no OUI), and appears to be swapping the
>
> Yes.
>
>> byte order during resume. This should be reported to the vendor.
>
> Well, the id-ns output posted earlier shows the same output before and
> after resume. Which is really weird.
>
> Either way we'll have to quirk it some way.
>
> Just to pinpoint this down a bit, what does
>
> nvme ns-descs /dev/nvme0n1
>
> report? I wonder if we get different IDs from the different methods
> to retrieve them given that namespace allocation looks at the
> Namespace Identification Descriptor last, while revalidation only
> looks at Identify Namespace.
>