[PATCH] nvme-pci: fix resume after AER recovery

Klaus Jensen its at irrelevant.dk
Tue Feb 7 02:36:32 PST 2023


On Feb  7 09:29, Javier.gonz at samsung.com wrote:
> On 07.02.2023 01:51, Grochowski, Maciej wrote:
> > I have tried the suggested approach, with some modification: pci_device
> > in pci_reset_secondary_bus is actually the bridge, not the NVMe device
> > itself, so I checked the devices behind that bridge to see if any has
> > the D0 bit, and based on that logic I run the custom delay.
> > 
> > Unfortunately, even with this approach I see the same issue for both
> > Samsung drives, and based on the kernel logs I can see that the wait
> > for the secondary bus reset is increased.  So it seems like this quirk
> > doesn't work for some reason. (I also tried increasing the delays to
> > different values, but it didn't help.)
> 
> Too bad.
> 
> I will write to you separately to get some dumps from the device. We
> have not seen this before, so we need to understand this a bit better.
> 
> Regarding the quirk, we are looking into it. We will come back with
> something in this thread later. Cc'ing Kanchan and Klaus.
> 

I dug up a PM1733 and I am not immediately able to reproduce on 6.2-rc7.

With an aer-inject error file,

  AER
  PCI_ID 0000:04:00.0
  UNCOR_STATUS MALF_TLP
  HEADER_LOG 0 1 2 3

I'm getting a Fatal error with type "Inaccessible, (Unregistered Agent
ID)", but it still recovers successfully:

  pcieport 0000:00:01.2: aer_inject: Injecting errors 00000000/00040000 into device 0000:04:00.0
  pcieport 0000:00:01.2: AER: Uncorrected (Fatal) error received: 0000:04:00.0
  nvme 0000:04:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
  nvme nvme1: frozen state error detected, reset controller
  pcieport 0000:03:00.0: AER: Downstream Port link has been reset (0)
  nvme nvme1: restart after slot reset
  nvme nvme1: Shutdown timeout set to 10 seconds
  nvme nvme1: 32/0/0 default/read/poll queues
  pcieport 0000:03:00.0: AER: device recovery successful
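
As an aid for reading the "Injecting errors 00000000/00040000" line above:
the two values appear to be the correctable and uncorrectable status masks,
and 0x00040000 is bit 18, which corresponds to Malformed TLP in the AER
Uncorrectable Error Status register. The sketch below decodes such a mask.
The function name and the short labels are illustrative assumptions (the
labels are modelled on the MALF_TLP keyword in the error file), and only a
subset of bits is listed:

```python
# Illustrative sketch: decode an AER uncorrectable status mask into labels.
# Bit positions follow the PCIe AER Uncorrectable Error Status register;
# the label strings are hypothetical shorthand, not an official naming.
UNCOR_BITS = {
    4: "DLP",          # Data Link Protocol Error
    12: "POISON_TLP",  # Poisoned TLP
    13: "FCP",         # Flow Control Protocol Error
    14: "COMP_TIME",   # Completion Timeout
    15: "COMP_ABORT",  # Completer Abort
    16: "UNX_COMP",    # Unexpected Completion
    17: "RX_OVER",     # Receiver Overflow
    18: "MALF_TLP",    # Malformed TLP
    19: "ECRC",        # ECRC Error
    20: "UNSUP",       # Unsupported Request
}


def decode_uncor_status(mask):
    """Return labels for every bit set in a 32-bit status mask."""
    labels = []
    for bit in range(32):
        if mask & (1 << bit):
            # Bits not in the table are reported by position only.
            labels.append(UNCOR_BITS.get(bit, "bit %d" % bit))
    return labels


if __name__ == "__main__":
    # Uncorrectable mask taken from the aer_inject log line above.
    print(decode_uncor_status(0x00040000))
```

Run on the mask from the log, this reports the single MALF_TLP bit, which
matches the UNCOR_STATUS line in the error file.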

Maciej, can you share firmware revision information and a few more
details on your reproducer/setup that might allow us to replicate this?


Thanks,
Klaus