Are AER corrected errors worrying?

Samuel Thibault samuel.thibault at ens-lyon.org
Mon Jan 4 16:36:48 EST 2021


Samuel Thibault, on Mon 04 Jan 2021 21:12:47 +0100, wrote:
> Vidya Sagar wrote:
> > Since this is a laptop, I'm suspecting that ASPM states might have
> > been enabled which could be causing these errors.
> 
> Keith Busch, on Mon 04 Jan 2021 10:44:35 -0800, wrote:
> > Sometimes these types of errors occur from low power settings, so you
> > can try disabling the automatic management of these (assuming the
> > hardware supports it). To disable nvme specific power state transitions,
> > the kernel parameter is "nvme_core.default_ps_max_latency_us=0".
> 
> I have added it, and I'll watch over the coming hours/days to see
> whether that avoids the issue.

I did get one more corrected error:

Jan  4 22:34:53 begin kernel: [ 7165.207562] pcieport 0000:00:1d.0: AER: Corrected error received: 0000:02:00.0
Jan  4 22:34:53 begin kernel: [ 7165.213891] nvme 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Jan  4 22:34:53 begin kernel: [ 7165.216949] nvme 0000:02:00.0:   device [15b7:5006] error status/mask=00000001/0000e000
Jan  4 22:34:53 begin kernel: [ 7165.219995] nvme 0000:02:00.0:    [ 0] RxErr
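
For reading the status/mask line above, here is a minimal sketch that decodes
the two hex words into bit labels. It assumes the standard PCIe Correctable
Error Status register layout and uses the short labels the kernel prints
(such as RxErr); the helper names decode_bits and parse_status_mask are made
up for this illustration and are not part of any tool.

```python
# Bit positions follow the PCIe Correctable Error Status register layout;
# the short labels mirror the ones the kernel prints (e.g. "RxErr").
CORRECTABLE_BITS = {
    0: "RxErr",         # Receiver Error
    6: "BadTLP",        # Bad TLP
    7: "BadDLLP",       # Bad DLLP
    8: "Rollover",      # REPLAY_NUM Rollover
    12: "Timeout",      # Replay Timer Timeout
    13: "NonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",   # Corrected Internal Error
    15: "HeaderOF",     # Header Log Overflow
}


def decode_bits(value):
    """Return the labels of all set bits in a correctable-error register value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits without a known label are reported by position.
            names.append(CORRECTABLE_BITS.get(bit, f"bit{bit}"))
    return names


def parse_status_mask(line):
    """Extract (status, mask) from an 'error status/mask=XXXXXXXX/XXXXXXXX' line."""
    field = line.split("status/mask=")[1].split()[0]
    status_hex, mask_hex = field.split("/")
    return int(status_hex, 16), int(mask_hex, 16)


line = "nvme 0000:02:00.0:   device [15b7:5006] error status/mask=00000001/0000e000"
status, mask = parse_status_mask(line)
print("reported:", decode_bits(status))
print("masked:  ", decode_bits(mask))
```

On the line from this log, the status word has only bit 0 set, which matches
the "[ 0] RxErr" line, and the mask covers bits 13 to 15.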

> > PCI also has automatic link power savings that you can disable with
> > parameter "pcie_aspm=off".
> 
> I'll try that if I still see errors with the nvme_core parameter.

I'm trying that now.
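
To confirm that the two parameters suggested above actually reached the
running kernel, something like the following sketch could be used. It only
looks at the boot command line in /proc/cmdline and splits it on whitespace
(quoted values with spaces are not handled); the names parse_cmdline,
check_power_params and WANTED are made up for this illustration.

```python
# The two settings suggested in this thread, as they would appear
# on the kernel command line.
WANTED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' map to None. Quoting is ignored in this sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def check_power_params(cmdline):
    """Report, per wanted parameter, whether it is set to the suggested value."""
    params = parse_cmdline(cmdline)
    return {key: params.get(key) == value for key, value in WANTED.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(check_power_params(f.read()))
    except OSError:
        # Not on Linux, or /proc is not mounted: nothing to report.
        pass
```

A True entry only means the parameter was passed at boot; whether the
hardware honours it is a separate question.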

Samuel


