blktests nvme/039 failure
alan.adamson at oracle.com
Mon Apr 10 16:06:50 PDT 2023
On 4/10/23 4:49 AM, Shin'ichiro Kawasaki wrote:
> Hello Alan,
>
> I noticed that nvme/039 has recently been failing intermittently on my system
> (around 40% of runs). The failure messages are as follows:
>
> nvme/039 => nvme0n1 (test error logging) [failed]
> runtime 0.176s ... 0.167s
> --- tests/nvme/039.out 2023-04-06 10:11:07.925670528 +0900
> +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad 2023-04-10 20:15:07.679538017 +0900
> @@ -1,5 +1,2 @@
> Running nvme/039
> - Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
> - Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
> - Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
> Test complete
>
> nvme/039 => nvme0n1 (test error logging) [failed]
> runtime 0.167s ... 0.199s
> --- tests/nvme/039.out 2023-04-06 10:11:07.925670528 +0900
> +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad 2023-04-10 20:15:09.114539650 +0900
> @@ -1,5 +1,4 @@
> Running nvme/039
> - Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
> Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
> Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
> Test complete
>
> It looks like the expected error messages were not reported.
>
> I suspect that the time between enabling error injection and issuing the I/O
> that triggers the error is too short. With the one-line change below, which
> adds a wait after error injection is enabled, the failures disappear. Do you
> think such a wait is a valid fix?
>
> tests/nvme/rc | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tests/nvme/rc b/tests/nvme/rc
> index 210a82a..7043c23 100644
> --- a/tests/nvme/rc
> +++ b/tests/nvme/rc
> @@ -652,6 +652,7 @@ _nvme_enable_err_inject()
> echo "$4" > /sys/kernel/debug/"$1"/fault_inject/dont_retry
> echo "$5" > /sys/kernel/debug/"$1"/fault_inject/status
> echo "$6" > /sys/kernel/debug/"$1"/fault_inject/times
> + sleep 0.1
> }
>
> _nvme_disable_err_inject()
I've been able to reproduce it. The sleep 0.1 helps but doesn't
eliminate the issue. I did notice that whenever there was a failure, the
log also contained a "blk_print_req_error: 2 callbacks suppressed"
message, which would break the log parsing the test relies on.
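
To tell a rate-limited run apart from one where the injection itself
misfired, something like the following could be run against the captured
log. This is only a sketch: both function names are hypothetical and not
part of blktests, and dropping the notice does not bring back the
messages the kernel suppressed.

```shell
# Hypothetical helper (not part of blktests): succeed if the log excerpt
# read from stdin contains a rate-limit notice such as
# "blk_print_req_error: 2 callbacks suppressed".
log_has_suppressed_callbacks() {
	grep -q 'callbacks suppressed'
}

# Hypothetical helper (not part of blktests): pass the log excerpt from
# stdin through, dropping rate-limit notices so that they do not end up
# in the output compared against 039.out. Suppressed error messages are
# still missing after this filter.
filter_suppressed_callbacks() {
	grep -v 'callbacks suppressed'
}
```

For example, "dmesg | log_has_suppressed_callbacks" returning success
after a failed run would point at rate limiting rather than at the
timing of the error injection.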
Alan