nvme nvme0: I/O 0 (I/O Cmd) QID 1 timeout, aborting, source drive corruption observed
Christoph Hellwig
hch at lst.de
Thu Dec 15 00:23:44 PST 2022
On Thu, Dec 15, 2022 at 10:38:33AM +0900, J. Hart wrote:
> I am attempting to load an nvme device (nvme0n1) to use as main system
> drive using the following command:
>
> rsync -axvH /. --exclude=/lost+found --exclude=/var/log.bu
> --exclude=/usr/var/log.bu --exclude=/usr/X11R6/var/log.bu
> --exclude=/home/jhart/.cache/mozilla/firefox/are7uokl.default-release/cache2.bu
> --exclude=/home/jhart/.cache/thunderbird/7zsnqnss.default/cache2.bu
> /mnt/root_new 2>&1 | tee root.log
>
> The total transfer would be approximately 50 GB. This is being done at run
> level 1, and only the kernel threads and the root shell are observed to be
> active.
>
> The following log messages appear after a minute or so, and rsync hangs.
> The nvme drive cannot be unmounted without a reboot.
Ok, this looks like the drive has firmware / hardware problems and
can't cope with the load.
>
> dmesg reports the following:
nvme0 is the destination drive, I guess?
>
> [Dec14 19:24] nvme nvme0: I/O 0 (I/O Cmd) QID 1 timeout, aborting
Can you enable CONFIG_NVME_VERBOSE_ERRORS so that we can see what
commands are hanging?
> I have also observed file system corruption on the source drive of the
> transfer. I would not normally think this to be related, except that after
> the first time I observed it, I made certain that I corrected the file
> content before any additional attempts, but have seen this again after
> every attempt. The modification dates and file sizes did not change, but
> the file content on the source drive did. I confirmed this using the
> "diff" utility, and again using a rsync dry run with the check sum test
> enabled.
Ok, that's really odd. The only way I could think of that happening
is if the drive does stray DMAs, which would be really grave.
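One way to pin down exactly which source files change during a copy attempt
is to record content checksums before and after it. The following is only a
minimal sketch, not something from the original report: the function name
snapshot_sums and the example paths are hypothetical, and it assumes GNU
findutils and coreutils. Keep the output file outside the directory being
scanned so it does not show up in its own listing.

```shell
# Hypothetical helper: write a sorted SHA-256 manifest of every regular
# file under DIR (staying on one filesystem) to OUTFILE.
# Usage: snapshot_sums DIR OUTFILE
snapshot_sums() {
    # The subshell keeps the cd local; the redirection is resolved in the
    # caller's working directory, so a relative OUTFILE works as expected.
    ( cd "$1" && find . -xdev -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# Example (paths are illustrative only):
#   snapshot_sums /home/jhart /tmp/before.sha256
#   ... run the rsync attempt ...
#   snapshot_sums /home/jhart /tmp/after.sha256
#   diff /tmp/before.sha256 /tmp/after.sha256
```

Any line that diff reports is a file whose content differs between the two
snapshots, independent of modification date or size. On a system that is not
idle, some differences will be legitimate writes.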
Do you have CONFIG_INTEL_IOMMU and CONFIG_INTEL_IOMMU_DEFAULT_ON enabled?
If not, it would be good to enable those to see if the iommu catches
any stray DMAs.
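As a rough sketch of how to check whether the options mentioned in this
message (CONFIG_NVME_VERBOSE_ERRORS, CONFIG_INTEL_IOMMU and
CONFIG_INTEL_IOMMU_DEFAULT_ON) are set in the running kernel: the helper
name check_kconfig is hypothetical, and the default location
/boot/config-$(uname -r) is an assumption that holds on many distributions
but not all of them. Pass the path to your kernel's config file explicitly
if it lives elsewhere.

```shell
# Hypothetical helper: report whether a kernel config option is built in
# (=y) or modular (=m) according to a plain-text kernel config file.
# Usage: check_kconfig OPTION [CONFIG_FILE]
check_kconfig() {
    opt="$1"
    # Assumed default location; override with the second argument.
    cfg="${2:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "$opt: cannot read $cfg"
        return 2
    fi
    if grep -q "^${opt}=[ym]\$" "$cfg"; then
        echo "$opt: enabled"
    else
        echo "$opt: not enabled"
        return 1
    fi
}

# Example:
#   for opt in CONFIG_NVME_VERBOSE_ERRORS CONFIG_INTEL_IOMMU \
#              CONFIG_INTEL_IOMMU_DEFAULT_ON; do
#       check_kconfig "$opt"
#   done
```

If CONFIG_INTEL_IOMMU is enabled but CONFIG_INTEL_IOMMU_DEFAULT_ON is not,
booting with intel_iommu=on on the kernel command line should turn the IOMMU
on without a rebuild.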