[BUG] nvme-pci: NVMe probe fails with ENODEV

Rajat Khandelwal rajat.khandelwal at linux.intel.com
Thu Mar 9 09:06:04 PST 2023


Hi,

On 3/9/2023 8:54 PM, Keith Busch wrote:
> On Thu, Mar 09, 2023 at 04:12:18PM +0100, Christoph Hellwig wrote:
>> On Thu, Mar 09, 2023 at 07:31:07PM +0530, Rajat Khandelwal wrote:
>>> Hi,
>>> I am seeking some help regarding an issue I encounter sporadically
>>> with Samsung Portable TBT SSD X5.
>>>
>>> Everything from Thunderbolt discovery through PCIe enumeration works
>>> fine, until the driver tries to read 'NVME_REG_CSTS' in 'nvme_reset_work'.
>>> Precisely, 'readl(dev->bar + NVME_REG_CSTS)' fails.
>>>
>>> I handle Type-C, Thunderbolt and USB4 on Chrome platforms, and currently
>>> we are working on Intel Raptor Lake systems.
>>> This issue has been seen since the ADL time frame and now shows up
>>> on RPL as well. I would really like to get to the bottom of the problem
>>> and close the issue.
>>>
>>> I have tried 5.10 and 6.1.15 kernels.
>> So we have a quirk for a device called Samsung X5 in core.c, which is a
>> bit of an unusual match.  Can you check that it gets applied for the
>> device that you are testing?
>>
>> Also if it gets applied, can you test this patch?
> That won't help here. The driver should be bailing on the device in
> nvme_pci_enable() before we do the ready check:
>
> static int nvme_pci_enable(struct nvme_dev *dev)
> {
> ...
>          if (readl(dev->bar + NVME_REG_CSTS) == -1) {
>                  result = -ENODEV;
>                  goto disable;
>          }
>
> It sounds like the bridge has a valid memory window, and the kernel assigned it
> to the device, but for some reason the device didn't apply it to its BAR. Maybe
> the device just doesn't support hotplug?

The issue is sporadic in nature, and it is seen even across reboots with the
device already attached, so it does not appear to be limited to hotplug.
Is such a scenario even possible (the device not applying the assigned address
to its BAR)?
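
To tell these cases apart on a failing boot, something like the following
userspace sketch might help. It is not driver code: it only looks at the first
64 bytes of the device's PCI configuration header and classifies BAR0. The
names classify_bar0(), mmio_read_failed() and the bar0_state enum are
hypothetical, made up for illustration; the offsets and bits come from the
standard PCI type 0 header, and I am assuming BAR0 is a memory BAR, as it is
for NVMe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets and bits from the standard PCI type 0 configuration header. */
#define CFG_VENDOR_ID   0x00
#define CFG_COMMAND     0x04
#define CFG_COMMAND_MEM 0x0002u  /* Memory Space Enable */
#define CFG_BAR0        0x10
#define CFG_BAR1        0x14
#define CFG_HDR_LEN     64

/* Hypothetical classification, for illustration only. */
enum bar0_state {
	BAR0_OK,            /* memory decoding enabled, address programmed */
	BAR0_DEVICE_GONE,   /* config reads return all ones */
	BAR0_MEM_DISABLED,  /* Memory Space Enable is clear */
	BAR0_UNASSIGNED,    /* BAR0 address bits are all zero */
};

/* Config space is little-endian; assemble values byte by byte. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg_read16(cfg, off) |
	       ((uint32_t)cfg_read16(cfg, off + 2) << 16);
}

/* Same test the driver applies to CSTS: an all-ones MMIO read. */
bool mmio_read_failed(uint32_t val)
{
	return val == UINT32_MAX;
}

/* Assumes BAR0 is a memory BAR (bit 0 clear). */
enum bar0_state classify_bar0(const uint8_t *cfg, size_t len)
{
	assert(cfg != NULL && len >= CFG_HDR_LEN);

	if (cfg_read16(cfg, CFG_VENDOR_ID) == 0xFFFF)
		return BAR0_DEVICE_GONE;
	if (!(cfg_read16(cfg, CFG_COMMAND) & CFG_COMMAND_MEM))
		return BAR0_MEM_DISABLED;

	uint32_t lo = cfg_read32(cfg, CFG_BAR0);
	uint64_t addr = lo & ~UINT32_C(0xF);  /* drop the flag bits */

	/* Type field (bits 2:1) == 10b means a 64-bit BAR spanning BAR0/BAR1. */
	if (((lo >> 1) & 0x3) == 0x2)
		addr |= (uint64_t)cfg_read32(cfg, CFG_BAR1) << 32;

	return addr ? BAR0_OK : BAR0_UNASSIGNED;
}
```

The 64-byte buffer could be filled from /sys/bus/pci/devices/<BDF>/config for
the SSD. If this reports BAR0_OK while the CSTS read still returns all ones,
that would point away from the BAR itself and towards the path to the device.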

Thanks
Rajat



