[Bug] NVME controller down (Seagate FireCuda 530)

thomas thomas at pourriel.org
Sun Nov 20 12:08:37 PST 2022


Hello,

Following a recommendation from https://bugzilla.kernel.org/show_bug.cgi?id=216709 , I'm posting this here, in the hope that it finds a resolution.

# ISSUE DESCRIPTION
A few seconds after GNOME starts, the NVMe drive, which is mounted but is not the boot drive, becomes inaccessible. The dmesg output is:

> [  281.692677] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
> [  281.778102] nvme 0000:04:00.0: enabling device (0000 -> 0002)
> [  281.778436] nvme nvme0: Removing after probe failure status: -19
> [  281.797929] nvme0n1: detected capacity change from 3907029168 to 0
> [  281.797947] blk_update_request: I/O error, dev nvme0n1, sector 2786568960 op 0x1:(WRITE) flags 0x103000 phys_seg 1 prio class 0
> [  281.797972] Buffer I/O error on dev nvme0n1p3, logical block 308281696, lost async page write
> [  281.850852] FAT-fs (nvme0n1p4): unable to read boot sector to mark fs as dirty
> [  343.901432] EXT4-fs warning (device nvme0n1p3): htree_dirblock_to_tree:1067: inode #77070337: lblock 0: comm ls: error -5 reading directory block
> [  343.902354] EXT4-fs error (device nvme0n1p3): __ext4_find_entry:1658: inode #77070337: comm test-nvme-write: reading directory lblock 0
> [  350.028540] Aborting journal on device nvme0n1p3-8.
> [  350.028548] Buffer I/O error on dev nvme0n1p3, logical block 223903744, lost sync page write
> [  350.028554] JBD2: Error -5 detected when updating journal superblock for nvme0n1p3-8.
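
For readers decoding the log above, here is a minimal Python sketch that translates the numeric values it contains. The helper names (describe_errno, is_all_ones, sectors_to_bytes) are made up for illustration and are not part of any kernel tooling. It assumes Linux errno numbering and that the "capacity change" line counts 512-byte sectors. An all-ones CSTS value is commonly read as a sign that the register read itself failed (for instance, the device no longer answering on the PCIe bus), but that is an interpretation, not a diagnosis.

```python
import errno
import os

SECTOR_BYTES = 512  # assumption: the block layer reports capacity in 512-byte sectors


def describe_errno(code: int) -> str:
    """Map a (possibly negative) kernel status code to its errno name and message."""
    n = abs(code)
    name = errno.errorcode.get(n, "UNKNOWN")
    return f"{name}: {os.strerror(n)}"


def is_all_ones(value: int, width_bits: int = 32) -> bool:
    """True if every bit of a register value of the given width is set."""
    return value == (1 << width_bits) - 1


def sectors_to_bytes(sectors: int) -> int:
    """Convert a sector count from the log into bytes."""
    return sectors * SECTOR_BYTES


print("status -19 :", describe_errno(-19))      # from "probe failure status: -19"
print("error  -5  :", describe_errno(-5))       # from the EXT4 and JBD2 lines
print("CSTS all ones:", is_all_ones(0xFFFFFFFF))
print("capacity before removal (TB):", sectors_to_bytes(3907029168) / 1e12)
```

On Linux, -19 corresponds to ENODEV and -5 to EIO, which matches a device that disappeared followed by I/O failures on the filesystems that were still mounted.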


# WHAT I'VE TRIED (without much luck)
- I've managed to install ZorinOS several times on the drive, so the drive works long enough to have everything copied and installed.
- On the first 2 Samsung drives, I've managed to install Windows. I did not try with the latest drive.
- This is the 3rd drive I'm testing (I previously tried two Samsung 970 EVO Plus 2 TB drives, and now a Seagate FireCuda 530 2 TB).
- I've tried with either kernel parameter pcie_aspm=off or nvme_core.default_ps_max_latency_us=0 and finally with both, but I get the same result in all 3 cases.
- I've also tried with various values for nvme_core.default_ps_max_latency_us (starting with 5500)
- If I boot from the drive in recovery mode and then resume the normal start, the freeze seems to be postponed, but only by a few minutes. The system still freezes.
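
When testing kernel parameters like the ones listed above, it can help to confirm that they actually reached the running kernel. The following is a rough Python sketch, not an official tool: parse_cmdline and read_text are hypothetical helper names, and the parser splits on whitespace only, so it ignores quoted values. It reads /proc/cmdline and the sysfs parameter files for nvme_core and pcie_aspm, and reports a file as missing rather than failing if it does not exist on a given system.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_cmdline(text: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into key -> value (None for bare flags)."""
    params: Dict[str, Optional[str]] = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def read_text(path: str) -> Optional[str]:
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


# Parameters as passed by the bootloader.
cmdline = parse_cmdline(read_text("/proc/cmdline") or "")
for key in ("pcie_aspm", "nvme_core.default_ps_max_latency_us"):
    print(f"cmdline {key} = {cmdline.get(key, '<not set>')}")

# Values the kernel is currently using, as exposed through sysfs.
for path in (
    "/sys/module/nvme_core/parameters/default_ps_max_latency_us",
    "/sys/module/pcie_aspm/parameters/policy",
):
    print(f"{path}: {read_text(path) or '<missing>'}")
```

If the command-line value and the sysfs value disagree, the parameter was probably not applied the way it was intended (for example, it was added to the wrong boot entry).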


# SOFTWARE / HARDWARE
- Linux Kernel : 5.15.0-53-generic
- Distribution : Zorin OS 16.2
- Motherboard : GIGABYTE B550 AORUS Elite V2 
- NVME Drive : Seagate FireCuda 530 2 TB (latest firmware)
- CPU : AMD Ryzen 5700X

Thomas.


