[PATCH blktests 0/2] add nvme test for creating sleep while atomic kernel BUG
Shinichiro Kawasaki
shinichiro.kawasaki at wdc.com
Thu Dec 5 17:09:17 PST 2024
On Dec 03, 2024 / 14:35, Nilay Shroff wrote:
>
>
> On 12/3/24 13:56, Shinichiro Kawasaki wrote:
> > CC: Jirong,
> >
> > On Dec 03, 2024 / 11:08, Nilay Shroff wrote:
> >>
> >>
> >> On 11/30/24 14:40, Shinichiro Kawasaki wrote:
> >>> On Nov 29, 2024 / 13:31, Nilay Shroff wrote:
> >>>> Hi,
> >>>>
> >>>> There're two patches in this series. The first patch is a preparation patch
> >>>> for reusing a common function nvmf_wait_for_ns from multiple nvme test scripts.
> >>>> The second patch adds a new nvme regression[1] test for commit 505363957fad
> >>>> ("nvmet: fix nvme status code when namespace is disabled").
> >>>
> >>> Hi Nilay, thank you very much for the fix actions. Much appreciated.
> >>>
> >>> I tried these blktests patches with the kernel just before the commit
> >>> 505363957fad, at the git hash 6825bdde4434. I expected the new test case to fail,
> >>> but it passes. I increased the number of iterations from 10 to 100, but it still
> >>> passes. Do you observe the test case failure on your test systems?
> >> If you ran blktests at git hash 6825bdde4434 then that doesn't include the ns changes
> >> which use mutex_lock. So it's expected that the test wouldn't fail. Did you try
> >> running the test using the latest upstream kernel or checking out the tree at commit
> >> 505363957fad ("nvmet: fix nvme status code when namespace is disabled")?
> >
> > Yes, I see the test case creates "BUG: sleeping function called from invalid
> > context". Now I see that you added this new test case nvme/055 to recreate the
> > BUG. However, I observed the BUG in the first place with nvme/052, which can be
> > used to confirm the BUG is fixed. So it does not look so meaningful to add the new
> > test case.
> >
> > I think my report "blktests failures with v6.12 kernel [X]" confused you. I
> > wrote "It is desired to have a better fix and the test case to confirm it.", but
> > I should have written "It is desired to have a better fix and the test case to
> > confirm that the fix does not break the intent of the trigger commit
> > 505363957fad." Please find the discussion about how to test the fix patch [Y].
> > The question was: how to "confirm that the commit 505363957fad achieves its
> > purpose"?
> >
> > [X] https://lore.kernel.org/linux-block/6crydkodszx5vq4ieox3jjpwkxtu7mhbohypy24awlo5w7f4k6@to3dcng24rd4/
> > [Y] https://lore.kernel.org/linux-nvme/20241023052042.GB1341@lst.de/
> >
> Yeah I read that explanation and I also saw in another discussion[Z] you mentioned,
> it's the udev daemon manifesting this kernel BUG. So my initial thought was that relying
> on udev to recreate a bug may not be a good idea because udev rules could change outside
> our control and those rules could differ from one distro to another.
>
> Hence I thought about writing a test case so that we could reliably recreate this BUG
> without any external dependency...
>
> [Z] https://lore.kernel.org/linux-nvme/tqcy3sveity7p56v7ywp7ssyviwcb3w4623cnxj3knoobfcanq@yxgt2mjkbkam/
Now I see your motivation to add this test case. I suggest adding this
background to the test case block comment, something like:

  ... The regression was found with nvme/052, which depends on udev rules.
  This test case catches the failure regardless of udev rule settings.