blktests failures with v5.19-rc1

Shinichiro Kawasaki shinichiro.kawasaki at wdc.com
Mon Jun 13 21:00:45 PDT 2022


On Jun 14, 2022 / 02:38, Chaitanya Kulkarni wrote:
> Shinichiro,
> 
> On 6/13/22 19:23, Keith Busch wrote:
> > On Tue, Jun 14, 2022 at 01:09:07AM +0000, Shinichiro Kawasaki wrote:
> >> (CC+: linux-pci)
> >> On Jun 11, 2022 / 16:34, Yi Zhang wrote:
> >>> On Fri, Jun 10, 2022 at 10:49 PM Keith Busch <kbusch at kernel.org> wrote:
> >>>>
> >>>> And I am not even sure this is real. I don't know yet why this is showing up
> >>>> only now, but this should fix it:
> >>>
> >>> Hi Keith
> >>>
> >>> Confirmed the WARNING issue was fixed with the change, here is the log:
> >>
> >> Thanks. I also confirmed that Keith's change to add __ATTR_IGNORE_LOCKDEP to
> >> dev_attr_dev_rescan avoids the WARN, on v5.19-rc2.
> >>
> >> I took a closer look into this issue and found that the deadlock WARN can
> >> be recreated with the following two commands:
> >>
> >> # echo 1 > /sys/bus/pci/devices/0000\:00\:09.0/rescan
> >> # echo 1 > /sys/bus/pci/devices/0000\:00\:09.0/remove
> >>
> >> And it can be recreated with PCI devices other than NVMe controllers, such
> >> as a SCSI controller or a VGA controller. So this is not a storage
> >> sub-system issue.
> >>
> >> I checked the function call stacks of the two commands above. As shown
> >> below, it looks like a possible ABBA deadlock is detected and warned about.
> > 
> > Yeah, I was mistaken on this report, so my proposal to suppress the warning is
> > definitely not right. If I run both 'echo' commands in parallel, I see it
> > deadlock frequently. I'm not familiar enough with this code to have any good
> > ideas on how to fix it, but I agree this is a generic PCI issue.
> 
> I think it is worth adding a testcase to blktests to make sure
> future releases are tested for this.

Yeah, this WARN is confusing for us, so it would be valuable to test for it in
blktests so that it does not recur. One thing I wonder is which test group the
test case should go in. The nvme group is probably the one to add it to.

Another thing I wonder about is kernel test suites other than blktests. Don't
we have a more appropriate test suite to check the PCI device rescan/remove
race? Such a test sounds more like a PCI bus sub-system test than a
block/storage test.

Having said that, I still think the test case is valuable for block/storage.
Unless anyone objects, I'm open to a patch that adds it.
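
As a rough illustration of what such a test might exercise, here is a minimal
sketch that issues the rescan and remove writes in parallel rather than one
after the other. The function name race_rescan_remove and its argument are
hypothetical, not part of blktests, and the sketch assumes the caller passes
the sysfs directory of a PCI device that is safe to remove.

```shell
#!/bin/bash
# Hypothetical helper, not an existing blktests function.
# $1: sysfs directory of a PCI device, for example
#     /sys/bus/pci/devices/0000:00:09.0 (assumed safe to remove).
race_rescan_remove() {
	local dev_dir="$1"
	local rescan_pid remove_pid

	# Start both sysfs writes in the background so that they can race.
	echo 1 > "${dev_dir}/rescan" &
	rescan_pid=$!
	echo 1 > "${dev_dir}/remove" &
	remove_pid=$!

	# If the kernel deadlocks, this wait never returns.
	wait "$rescan_pid" "$remove_pid"
}
```

It would need to run as root on a lockdep-enabled kernel, followed by a check
of the kernel log for the WARN. Writing 1 to /sys/bus/pci/rescan afterwards
should bring the removed device back.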

-- 
Shin'ichiro Kawasaki

