[Bug 219609] File corruptions on SSD in 1st M.2 socket of AsRock X600M-STX + Ryzen 8700G
Bruno Gravato
bgravato at gmail.com
Sun Feb 2 00:32:31 PST 2025
I just realized I replied only to the bugzilla list. Sorry about that.
So I'm forwarding my reply to everyone else who was in CC and may not
be getting the bugzilla emails.
> > Is there any characterisation of the corrupted data; last time I
> > looked at the bz there wasn't.
>
> Yes, there is. (And I already reported it at least on the Debian bug
> tracker, see links in the initial message.)
>
> f3 reports overwritten sectors, i.e. it looks like the pseudo-random
> test pattern is written to wrong position. These corruptions occur in
> clusters whose size is an integer multiple of 2^17 bytes in most cases
> (about 80%) and 2^15 in all cases.
>
> The frequency of these corruptions is roughly 1 cluster per 50 GB written.
>
> Can others confirm this or do they observe a different characteristic?
In my tests I was using real data: a backup of my files.
In one such test I copied over 300K files of various sizes and types,
totalling about 60GB. A bit over 20 files got corrupted.
I tried copying the files over the network (ethernet) using rsync/ssh.
I also tried restoring the files using restic (over ssh as well). And
I also tried copying the files locally from a SATA disk. In all cases
I got similar results with some files being corrupted.
The destination NVMe disk was using btrfs, and running btrfs scrub
after the copy detected quite a few checksum errors.
I analyzed some of those corrupted files and one of them happened to
be a text file (linux kernel source code).
A big portion of the text was replaced with text from another file in
the same directory (being text made it easy to find where it came
from).
So this was a contiguous block of text that was overwritten with a
contiguous block of text from another file.
If I remember correctly the other file was not corrupted (so the
blocks weren't swapped). It looked like a certain block of text was
written twice: on the correct file and on another file in the same
directory.
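
For anyone who wants to look for the same pattern, here is a minimal
Python sketch of how the origin of a foreign block could be located
among the other files in the same directory. It is only an illustration
and not the procedure used for the report above; the function name
find_block_source and its parameters are hypothetical, and it assumes
the files are small enough to read into memory.

```python
import os

def find_block_source(block, directory, exclude=()):
    """Search every regular file in `directory` for the byte string `block`.

    Returns a list of (path, offset) for the first match in each file.
    `exclude` lists paths to skip (e.g. the corrupted file itself).
    Hypothetical helper; reads each file fully into memory.
    """
    hits = []
    skip = {os.path.abspath(p) for p in exclude}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or os.path.abspath(path) in skip:
            continue
        with open(path, "rb") as f:
            data = f.read()
        pos = data.find(block)
        if pos != -1:
            hits.append((path, pos))
    return hits
```

Passing a chunk taken from the middle of the foreign region as `block`
avoids depending on where exactly the overwritten range starts or ends.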
I also got some jpeg images corrupted. I was able to open and view
(partially) those images, and it looked like a portion of the image was
repeated in a different part of it, so blocks of the same file were
probably duplicated and overwritten within the file itself.
The blocks being overwritten seemed to be of different sizes in different files.
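
To put numbers on the block sizes, a known-good copy of a file can be
compared with the corrupted one. The following is a minimal Python
sketch under a few assumptions: both copies are available, a 512-byte
comparison granularity is fine enough, and the names corrupted_extents
and describe are hypothetical. It lists each differing run and says
whether its length is a multiple of 2^15 or 2^17 bytes, which are the
cluster sizes mentioned in the quoted f3 results.

```python
SECTOR = 512  # comparison granularity in bytes (an assumption)

def corrupted_extents(good_path, bad_path, sector=SECTOR):
    """Return a list of (offset, length) runs where the two files differ."""
    extents = []
    start = None  # offset where the current differing run began
    offset = 0
    with open(good_path, "rb") as good, open(bad_path, "rb") as bad:
        while True:
            a = good.read(sector)
            b = bad.read(sector)
            if not a and not b:
                break
            if a != b:
                if start is None:
                    start = offset
            elif start is not None:
                # Run ended at the previous sector boundary.
                extents.append((start, offset - start))
                start = None
            offset += max(len(a), len(b))
    if start is not None:
        extents.append((start, offset - start))
    return extents

def describe(extents):
    """Print each run with its alignment to the reported cluster sizes."""
    for off, length in extents:
        print(f"offset={off} length={length} "
              f"multiple_of_2^15={length % 2**15 == 0} "
              f"multiple_of_2^17={length % 2**17 == 0}")
```

Lengths are rounded up to the comparison granularity, so a run is only
as precise as SECTOR; a sector that happens to be identical in both
copies will also split one overwritten region into two runs.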
Bruno