blktests failures with v6.8 kernel

Shinichiro Kawasaki shinichiro.kawasaki at wdc.com
Mon Mar 18 23:45:14 PDT 2024


Hi all,

I ran the latest blktests (git hash: 607513e64e48) with the v6.8 kernel and
observed three failures. I also checked the CKI project blktests runs with the
v6.8 kernel and found two more failures. In total, five failure symptoms were
observed, as listed below.

Compared with the v6.8-rc1 kernel [1], the nvme test group has greatly improved
for the fc transport (thanks go to Daniel). Test runs no longer hang. A few test
cases still fail, but it is a great improvement :)

[1] https://lore.kernel.org/linux-block/44i4y3fyqcz6k2pmum6toqylc2lvveb7x37ngskzfof52hoi2r@vxdxdnmggbj5/

List of failures
================
#1: block/011
#2: nvme/041,044 (fc transport)
#3: srp/002, 011 (rdma_rxe driver)
#4: nbd/002 (CKI failure)
#5: zbd/010 (CKI failure)

Failure description
===================

#1: block/011

   The test case fails with NVMe devices due to the lockdep WARNING "possible
   circular locking dependency detected". Reported in Sep/2022 [2]. At LSF
   2023, it was noted that this failure should be fixed. An RFC fix patch was
   posted recently [3], but it needs more discussion before a fix can land.

   [2] https://lore.kernel.org/linux-block/20220930001943.zdbvolc3gkekfmcv@shindev/
   [3] https://lore.kernel.org/linux-nvme/20231213051704.783490-1-shinichiro.kawasaki@wdc.com/

#2: nvme/041,044 (fc transport)

   With the trtype=fc configuration, nvme/041 and 044 fail with similar
   error messages:

  nvme/041 (Create authenticated connections)                  [failed]
      runtime  2.677s  ...  4.823s
      --- tests/nvme/041.out      2023-11-29 12:57:17.206898664 +0900
      +++ /home/shin/Blktests/blktests/results/nodev/nvme/041.out.bad     2024-03-19 14:50:56.399101323 +0900
      @@ -2,5 +2,5 @@
       Test unauthenticated connection (should fail)
       disconnected 0 controller(s)
       Test authenticated connection
      -disconnected 1 controller(s)
      +disconnected 0 controller(s)
       Test complete
  nvme/044 (Test bi-directional authentication)                [failed]
      runtime  4.740s  ...  7.482s
      --- tests/nvme/044.out      2023-11-29 12:57:17.212898647 +0900
      +++ /home/shin/Blktests/blktests/results/nodev/nvme/044.out.bad     2024-03-19 14:51:08.062067741 +0900
      @@ -4,7 +4,7 @@
       Test invalid ctrl authentication (should fail)
       disconnected 0 controller(s)
       Test valid ctrl authentication
      -disconnected 1 controller(s)
      +disconnected 0 controller(s)
       Test invalid ctrl key (should fail)
       disconnected 0 controller(s)
      ...
      (Run 'diff -u tests/nvme/044.out /home/shin/Blktests/blktests/results/nodev/nvme/044.out.bad' to see the entire diff)
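
   When triaging runs like the one above, the failed test names can be pulled
   out of the check output with a small script. The following is only a
   minimal sketch, assuming result lines have the form
   "group/number (description) [status]" as in the log above. The helper name
   failed_tests is hypothetical and not part of blktests.

```python
import re

# Matches blktests result lines such as:
#   nvme/041 (Create authenticated connections)                  [failed]
# Assumption: the status is a single word in square brackets at end of line.
RESULT_RE = re.compile(r"^\s*(\w+/\d+) \((.*)\)\s+\[(\w+)\]\s*$")


def failed_tests(log_text):
    """Return the names of test cases whose status is 'failed'."""
    failed = []
    for line in log_text.splitlines():
        m = RESULT_RE.match(line)
        if m and m.group(3) == "failed":
            failed.append(m.group(1))
    return failed


# Sample input copied from the nvme/041 and nvme/044 results above.
sample = """\
  nvme/041 (Create authenticated connections)                  [failed]
      runtime  2.677s  ...  4.823s
  nvme/044 (Test bi-directional authentication)                [failed]
      runtime  4.740s  ...  7.482s
"""

print(failed_tests(sample))
```

   Lines that do not look like a result line (runtime, diff hunks) are
   skipped, so the whole console log can be passed in unfiltered.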

#3: srp/002, 011 (rdma_rxe driver)

   A test process hang is observed occasionally. Reported to the relevant
   mailing lists in Aug/2023 [4]. Blktests was modified to change the default
   driver from rdma_rxe to siw to avoid impact on blktests users. The root
   cause is not yet understood.

   [4] https://lore.kernel.org/linux-rdma/18a3ae8c-145b-4c7f-a8f5-67840feeb98c@acm.org/T/#mee9882c2cfd0cfff33caa04e75418576f4c7a789

#4: nbd/002 (CKI failure)

   CKI reported the failure [5]. I confirmed that the test case fails
   occasionally on my test machine. I think the blktests script can be improved
   to avoid the failure. I plan to post a fix candidate patch.

  nbd/002 (tests on partition handling for an nbd device)      [failed]
      runtime    ...  0.414s
      --- tests/nbd/002.out       2024-02-19 19:25:07.453721466 +0900
      +++ /home/shin/kts/kernel-test-suite/sets/blktests/log/runlog/nodev/nbd/002.out.bad 2024-03-19 14:53:56.320218177 +0900
      @@ -1,4 +1,4 @@
       Running nbd/002
       Testing IOCTL path
       Testing the netlink path
      -Test complete
      +Didn't have partition on the netlink path

   [5] https://datawarehouse.cki-project.org/kcidb/tests/11634679

#5: zbd/010 (CKI failure)

   CKI observed the failure [6], and Yi Zhang reported it to the relevant
   mailing lists [7]. Though the WARN was observed with the test case zbd/010
   for zoned block devices, it can be recreated with non-zoned regular block
   devices when f2fs is set up with multiple block devices. A fix in f2fs is
   expected.

   [6] https://datawarehouse.cki-project.org/issue/2508
   [7] https://lore.kernel.org/linux-f2fs-devel/CAHj4cs-kfojYC9i0G73PRkYzcxCTex=-vugRFeP40g_URGvnfQ@mail.gmail.com/

