BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 37868055...

Justin Piszcz jpiszcz at lucidpixels.com
Tue Nov 11 10:25:02 PST 2025


On Tue, Nov 11, 2025 at 1:08 PM Chaitanya Kulkarni
<chaitanyak at nvidia.com> wrote:
>
> On 11/11/25 08:56, Chris Murphy wrote:
> >
> > On Mon, Nov 10, 2025, at 10:05 AM, Justin Piszcz wrote:
> >> Hello,
> >>
> >> I am using an ASUS Pro WS W680-ACE motherboard with 2 x Samsung SSD
> >> 990 PRO with Heatsink 4TB NVME SSDs with BTRFS R1.  When a BTRFS scrub
> >> was kicked off this morning, suddenly BTRFS was noting errors for one
> >> of the drives.  The system became unusable and I had to power cycle
> >> and re-run the scrub and everything is now OK.  My question is what
> >> would cause this?
> > We'd have to see a complete dmesg at the time the problem occurred. If the same device holds system log files, seems like a pretty good chance none of it made it to persistent storage.


> >none of it made it to persistent storage.
Absolutely correct!!  Luckily, I have remote syslog enabled and it
captured the errors that were not logged to persistent storage.  Do the
errors below point to an NVMe firmware issue?

"2025-11-10 01:42:51","notice","user","","machine1","[4043499.402974]
BTRFS info (device nvme1n1p2): scrub: started on devid 2"
"2025-11-10 01:42:51","notice","user","","machine1","[4043499.403000]
BTRFS info (device nvme1n1p2): scrub: started on devid 1"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.683686]
nvme nvme0: I/O tag 2 (a002) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.684742]
nvme nvme0: I/O tag 3 (c003) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.685778]
nvme nvme0: I/O tag 4 (1004) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.686802]
nvme nvme0: I/O tag 5 (c005) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.712274]
nvme nvme0: I/O tag 6 (0006) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.712483]
nvme nvme0: I/O tag 7 (4007) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.712698]
nvme nvme0: I/O tag 8 (3008) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:22","notice","user","","machine1","[4043590.712906]
nvme nvme0: I/O tag 9 (5009) opcode 0x2 (I/O Cmd) QID 2 timeout,
aborting req_op:READ(0) size:131072"
"2025-11-10 01:44:53","notice","user","","machine1","[4043621.419061]
nvme nvme0: I/O tag 2 (a002) opcode 0x2 (I/O Cmd) QID 2 timeout, reset
controller"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.821452]
nvme nvme0: Device not ready; aborting reset, CSTS=0x1"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.850059]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.850351]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.850630]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.850903]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.851173]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.851440]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.851705]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:14","notice","user","","machine1","[4043702.851971]
nvme nvme0: Abort status: 0x371"
"2025-11-10 01:46:34","notice","user","","machine1","[4043722.877036]
nvme nvme0: Device not ready; aborting reset, CSTS=0x1"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.941289]
pcieport 0000:00:1b.4: AER: Multiple Uncorrectable (Non-Fatal) error
message received from 0000:06:00.0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.942254]
nvme nvme0: Disabling device after reset failure: -19"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969046]
I/O error, dev nvme0n1, sector 86788096 op 0x1:(WRITE) flags 0x100000
phys_seg 1 prio class 2"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969083]
I/O error, dev nvme0n1, sector 464581888 op 0x0:(READ) flags 0x4000
phys_seg 3 prio class 3"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969128]
I/O error, dev nvme0n1, sector 45027984 op 0x1:(WRITE) flags 0x4000800
phys_seg 1 prio class 2"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969127]
I/O error, dev nvme0n1, sector 117882144 op 0x1:(WRITE) flags 0x1800
phys_seg 8 prio class 2"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969146]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 2, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969146]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 2, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969163]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 4, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969160]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 3, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969167]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 5, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969167]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 6, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969179]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 7, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969179]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 8, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969184]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 9, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.969187]
BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 10, rd 0,
flush 0, corrupt 0, gen 0"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.972280]
BTRFS warning (device nvme1n1p2): lost super block write due to IO
error on /dev/nvme0n1p2 (-5)"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.972938]
BTRFS error (device nvme1n1p2): fixed up error at logical 782844493824
on dev /dev/nvme0n1p2 physical 237354287104"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.972957]
BTRFS error (device nvme1n1p2): fixed up error at logical 782844821504
on dev /dev/nvme0n1p2 physical 237354614784"
"2025-11-10 01:46:35","notice","user","","machine1","[4043722.972960]
BTRFS error (device nvme1n1p2): fixed up error at logical 782844821504
on dev /dev/nvme0n1p2 physical 237354614784"
[ ..]
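
The numeric values in these messages can be decoded by hand.  Below is
a minimal Python sketch, not kernel code.  It assumes the bit layout
from the NVMe base specification (CSTS: RDY in bit 0, CFS in bit 1,
SHST in bits 3:2; status: SC in bits 7:0, SCT in bits 10:8, DNR in
bit 14, as printed by the Linux driver with the phase bit already
shifted out).  The function names are made up for illustration, and
the one named status code is taken from NVME_SC_HOST_ABORTED_CMD in
include/linux/nvme.h.

```python
# Illustrative decoder for the values seen in the log above.
# Function names are hypothetical; bit positions follow the NVMe base spec.

SCT_NAMES = {
    0x0: "generic command status",
    0x1: "command specific status",
    0x2: "media and data integrity errors",
    0x3: "path related status",
}

# Only the code seen in this log; value matches NVME_SC_HOST_ABORTED_CMD
# in Linux's include/linux/nvme.h.
KNOWN_STATUS = {0x371: "host aborted command"}


def decode_csts(csts: int) -> dict:
    """Split the Controller Status (CSTS) register into its low fields."""
    return {
        "RDY": csts & 0x1,           # bit 0: controller ready
        "CFS": (csts >> 1) & 0x1,    # bit 1: controller fatal status
        "SHST": (csts >> 2) & 0x3,   # bits 3:2: shutdown status
    }


def decode_status(status: int) -> dict:
    """Split a driver-printed NVMe status value into SCT, SC and DNR."""
    sct = (status >> 8) & 0x7        # status code type
    sc = status & 0xFF               # status code
    dnr = (status >> 14) & 0x1       # do not retry
    return {
        "sct": sct,
        "sct_name": SCT_NAMES.get(sct, "unknown"),
        "sc": sc,
        "dnr": dnr,
        "name": KNOWN_STATUS.get(status & 0x7FF, "not in table"),
    }
```

Under those assumptions, CSTS=0x1 means RDY is still set and CFS is
clear, and 0x371 is a path-related status that the host generates
itself, which would fit the abort commands being cancelled during the
reset rather than answered by the drive.  That alone does not separate
a firmware hang from a PCIe link problem.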


> > Chris Murphy
> >
> Isolate the problem between kernel and SSD FW by:-
>
> 1. run the same workload on different vendor SSDs.
> 2. run the same workload on qemu nvme emulation.
>
> This will allow you to remove the SSD FW out of this question.

Got it.  I was running 4B2QJXD7 firmware on both drives when this issue
occurred.  Samsung recently released a new NVMe firmware update,
7B2QJXD7, which they note "address(es) an intermittent non-recognition
and blue screen issue."  I've flashed the drives to 7B2QJXD7 and will
see whether the issue recurs.
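
To confirm which firmware each controller is actually running after a
flash, the revision can be read back from sysfs.  The following is a
small Python sketch under the assumption that the kernel exposes the
usual firmware_rev attribute under /sys/class/nvme/<controller>/; the
function name and the sysfs_root parameter are hypothetical and exist
only so the example can be pointed at a different directory.

```python
from pathlib import Path


def nvme_firmware_revisions(sysfs_root="/sys/class/nvme"):
    """Return {controller name: firmware revision} for each NVMe controller.

    Controllers without a readable firmware_rev attribute are skipped.
    An empty dict is returned if sysfs_root does not exist.
    """
    revisions = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return revisions
    for ctrl in sorted(root.iterdir()):
        fw = ctrl / "firmware_rev"
        if fw.is_file():
            # The attribute is typically padded with trailing whitespace.
            revisions[ctrl.name] = fw.read_text().strip()
    return revisions
```

Calling nvme_firmware_revisions() with no arguments on the affected
machine should list one entry per controller (nvme0 and nvme1 here),
so a drive that silently kept the old revision would stand out.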

>
> -ck
>
>


