[PATCH v2] tests/nvme: Add admin-passthru+reset race test

Keith Busch kbusch at kernel.org
Mon Nov 21 14:47:24 PST 2022


On Mon, Nov 21, 2022 at 03:34:44PM -0700, Jonathan Derrick wrote:
> On 11/21/2022 1:55 PM, Keith Busch wrote:
> > On Thu, Nov 17, 2022 at 02:22:10PM -0700, Jonathan Derrick wrote:
> >> I seem to have isolated the error mechanism for older kernels, but 6.2.0-rc2
> >> reliably segfaults my QEMU instance (something else to look into) and I don't
> >> have any 'real' hardware to test this on at the moment. It looks like several
> >> passthru commands are able to enqueue prior/during/after resetting/connecting.
> > 
> > I'm not seeing any problem with the latest nvme-qemu after several dozen
> > iterations of this test case. In that environment, the formats and
> > resets complete practically synchronously with the call, so everything
> > proceeds quickly. Is there anything special I need to change?
> >  
> I can still repro this with the nvme-fixes tag, so I'll have to dig into it
> myself. Does the tighter loop in the test comment header produce results?

My qemu's backing storage is a null_blk device, which makes format a
no-op, but I can try something slower if you think that will give
different results.
These kinds of tests are definitely more pleasant to run under
emulation, so having the recipe to recreate there is a boon.
 
> >> The issue seems to be very heavily timing related, so the loop in the header is
> >> a lot more forceful in this approach.
> >>
> >> As far as the loop goes, I've noticed it will typically repro immediately or
> >> pass the whole test.
> > 
> > I can only get a possible repro in scenarios that have multi-second long,
> > serialized format times. Even then, it still appears that everything
> > fixes itself after waiting. Are you observing the same, or is it stuck
> > forever in your observations?
> In 5.19, it gets stuck forever with lots of formats outstanding and
> controller stuck in resetting. I'll keep digging. Thanks Keith

At the moment I'm interested in upstream, so either Linus' latest
6.1-rc, or the nvme-6.2 branch. If you can confirm these are okay (which
appears to be the case on my side), then I can definitely shift focus to
stable back-ports. But if they're not okay, then I want to straighten
that side out first.


