blktests failures with v5.19-rc1

Chaitanya Kulkarni chaitanyak at nvidia.com
Thu Jun 16 10:55:42 PDT 2022


On 6/15/2022 9:42 PM, Shinichiro Kawasaki wrote:
> On Jun 16, 2022 / 07:13, Yi Zhang wrote:
>> On Thu, Jun 16, 2022 at 6:01 AM Chaitanya Kulkarni
>> <chaitanyak at nvidia.com> wrote:
>>>
>>> On 6/15/22 12:47, Bjorn Helgaas wrote:
>>>> On Tue, Jun 14, 2022 at 04:00:45AM +0000, Shinichiro Kawasaki wrote:
>>>>> On Jun 14, 2022 / 02:38, Chaitanya Kulkarni wrote:
>>>>>> Shinichiro,
> 
> [snip]
> 
>>>>>> I think it is worth adding a testcase to blktests to make sure
>>>>>> future releases are tested for this.
>>>>>
>>>>> Yeah, this WARN is confusing for us, so it would be valuable to
>>>>> test it with blktests so that it does not recur. One thing I wonder
>>>>> is: which test group will the test case fall in? The nvme group
>>>>> is probably the one to add it to.
>>>>>
>>>
>>> Since this issue was discovered with nvme rescan and remove,
>>> it should be added to the nvme category.
>>
>> We already have nvme/032, which tests nvme rescan/reset/remove, and the
>> issue was reported by running this one. Do we still need one more?
> 
> That is a point. The current nvme/032 checks nvme pci adapter rescan/reset/remove
> during I/O to catch problems in the nvme driver and block layer, but it can
> also catch problems in the pci sub-system. I think Chaitanya's motivation
> for the new test case is to distinguish those two.
> 

Yes, exactly.

> If we have the new test case, its code will largely duplicate the nvme/032
> code. To avoid such duplication, it would be good to improve nvme/032 to
> have two steps. The 1st step checks that nvme pci adapter rescan/reset/remove
> without I/O causes no kernel WARN (or any other unexpected kernel messages). Any
> issue found in this step is reported as a pci sub-system issue. The 2nd step
> checks nvme pci adapter rescan/reset/remove during I/O, as the current nvme/032
> does. With this, we don't need the new test case, but still we can distinguish
> the problems in nvme/block sub-system and pci sub-system.
> 

Totally agree with this.
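
To illustrate the two-step idea above, here is a minimal Python sketch of
the classification logic only. It is not blktests code (blktests test cases
are bash scripts), and the function names and the "WARNING:"/"BUG:" match
are assumptions made for illustration: each step's kernel log is scanned for
unexpected messages, and a hit in the no-I/O step is attributed to the pci
sub-system, while a hit only in the I/O step is attributed to nvme/block.

```python
import re

# Assumption: an unexpected kernel message shows up as a log line that
# contains "WARNING:" or "BUG:". A real test may check dmesg differently.
_UNEXPECTED = re.compile(r"\b(WARNING|BUG):")


def unexpected_lines(dmesg_text):
    """Return the kernel log lines that look like a WARN or BUG."""
    return [line for line in dmesg_text.splitlines() if _UNEXPECTED.search(line)]


def classify_two_step(dmesg_no_io, dmesg_with_io):
    """Attribute a failure to a sub-system based on which step hit it.

    dmesg_no_io:   kernel log captured during rescan/reset/remove without I/O
    dmesg_with_io: kernel log captured during rescan/reset/remove during I/O
    """
    if unexpected_lines(dmesg_no_io):
        # Step 1 runs no I/O, so report this as a pci sub-system issue.
        return "pci"
    if unexpected_lines(dmesg_with_io):
        # Step 1 was clean, so the problem only appears with I/O in flight.
        return "nvme/block"
    return "pass"
```

Checking the no-I/O step first is the design choice that matters here: if
the log is already dirty without I/O, the second step cannot tell us
anything more about the nvme driver or block layer.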

>>>>> Another thing I wonder about is kernel test suites other than blktests.
>>>>> Don't we have a more appropriate test suite to check the PCI device
>>>>> rescan/remove race? Such a test sounds more like a PCI bus
>>>>> sub-system test than a block/storage test.
>>>
>>> I don't think so; otherwise we could have caught it a long time back,
>>> but we clearly did not.
> 
> I see, then it looks like blktests is the test suite to test it.
> 

-ck



