blktests nvme/039 failure

Shin'ichiro Kawasaki shinichiro at fastmail.com
Wed Apr 12 01:49:30 PDT 2023


On Apr 11, 2023 / 13:55, alan.adamson at oracle.com wrote:
> 
> On 4/10/23 6:27 PM, Shin'ichiro Kawasaki wrote:
> > On Apr 10, 2023 / 16:06, alan.adamson at oracle.com wrote:
> > [...]
> > > I've been able to reproduce it.  The sleep .1 helps but doesn't eliminate
> > > the issue.  I did notice whenever there was a failure, there was also a
> > > "blk_print_req_error: 2 callbacks suppressed" in the log which would break
> > > the parsing the test needs to do.
> > Ah, I see. The error messages were no printed because pr_err_ratelimited()
> > suppressed them. AFAIK, there is no way to disable or relax the rate limit from
> > userland. I think sleep for 5 = DEFAULT_RATE_LIMIT seconds at the beginning of
> > the test will ensure the error message printed. It will also avoid the "x
> > callbacks suppressed" messages.
> > 
> > With the change below, I observe no failure on my system.
> > 
> > diff --git a/tests/nvme/039 b/tests/nvme/039
> > index 11d6d24..5d76297 100755
> > --- a/tests/nvme/039
> > +++ b/tests/nvme/039
> > @@ -139,6 +139,9 @@ test_device() {
> >   	_nvme_err_inject_setup "${ns_dev}" "${ctrl_dev}"
> > +	# wait DEFAULT_RATELIMIT_INTERVAL=5 seconds to ensure errors are printed
> > +	sleep 5
> > +
> >   	inject_unrec_read_on_read "${ns_dev}"
> >   	inject_invalid_status_on_read "${ns_dev}"
> >   	inject_write_fault_on_write "${ns_dev}"
> > 
> Looks good.  I tested it with my environment. 

I did further testing, and found the test case still fails... The 5 seconds wait
ensures the expected errors are printed, but still we have the chance of the
"callbacks suppressed" message when previous test cases have printed many
messages. Here's the kernel output I observed with the fix above.

  run blktests nvme/039 at 2023-04-12 13:52:19
  nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
  blk_print_req_error: 5804 callbacks suppressed
  critical medium error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 2
  nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x3 / sc 0x75) DNR
  I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 2
  nvme0n1: I/O Cmd(0x1) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x80) DNR
  critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2

I think further modification is required to grep -v the "callbacks suppressed".
Will post a patch with this modification to ask your comment.

> When we get the passthru
> logging in,  I'll be adding more tests here, but I'll have to do another
> sleep 5.

Agreed.



More information about the Linux-nvme mailing list