[bug report] most of blktests nvme/ failed on the latest linux tree

Yi Zhang yi.zhang at redhat.com
Tue Jun 27 19:58:29 PDT 2023


Hi Chaitanya

Here are the trees I used; I will try to bisect later today.

# git remote -v
origin https://github.com/torvalds/linux.git (fetch)
origin https://github.com/torvalds/linux.git (push)
# git log -1 --oneline
98be618ad030 (HEAD -> master, origin/master, origin/HEAD) Merge tag
'Smack-for-6.5' of https://github.com/cschaufler/smack-next

# cd ../blktests/
# git remote -v
origin https://github.com/osandov/blktests (fetch)
origin https://github.com/osandov/blktests (push)
# git log -1 --oneline
154e652 (HEAD -> master, origin/master, origin/HEAD) block/034: Test
memory is released by null-blk driver with memory_backed=1
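
For the bisect, a minimal sketch of how each step could be judged from a saved blktests log. It assumes v6.4 as the good point (the failure is not reproduced there) and 98be618ad030 as the bad point. The function name bisect_verdict and the log file are hypothetical. It only looks for the "[passed]" and "[failed]" result markers that blktests prints.

```shell
#!/bin/sh
# Hypothetical helper: read a saved blktests log and print the verdict
# to hand to git bisect ("good", "bad" or "skip").
bisect_verdict() {
    log="$1"
    # A missing or empty log means the run could not be judged
    # (for example the kernel did not build or boot).
    if [ ! -s "$log" ]; then
        echo skip
        return
    fi
    if grep -q '\[failed\]' "$log"; then
        # Any failed test marks this commit as bad.
        echo bad
    elif grep -q '\[passed\]' "$log"; then
        # No failure and at least one pass: treat the commit as good.
        echo good
    else
        # Only "[not run]" results: nothing to conclude from this run.
        echo skip
    fi
}
```

A session would start with "git bisect start 98be618ad030 v6.4". At each step, build and boot the kernel, save the output of "./check nvme" to a log, and run "git bisect $(bisect_verdict nvme.log)".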


On Wed, Jun 28, 2023 at 10:05 AM Chaitanya Kulkarni
<chaitanyak at nvidia.com> wrote:
>
> On 6/27/23 15:37, Chaitanya Kulkarni wrote:
> > On 6/27/23 14:21, Sagi Grimberg wrote:
> >>> Hello
> >>>
> >>> I found this failure on the latest linux tree, and it cannot be
> >>> reproduced on v6.4, so it should be a regression merged into the
> >>> linux tree after v6.4. I checked the commits merged after v6.4 and
> >>> found that the commit below touched the related code, but I am not
> >>> sure whether it introduced the failure.
> >>>
> >>> commit 959ffef13bac792e4e2e3321d6e2bd2b00c0f5f9
> >>> Author: Chaitanya Kulkarni <kch at nvidia.com>
> >>> Date:   Thu Jun 1 23:47:42 2023 -0700
> >>>
> >>>       nvme-fabrics: open code __nvmf_host_find()
> >> That just moved code, no functional change,
> >> most likely the below was the offender:
> >> ae8bd606e09b ("nvme-fabrics: prevent overriding of existing host")
> > For nvme-6.5 blktests are passing with nvme_trtype "loop" and "tcp"
> > see [1] with following HEAD :-
> >
> > commit 99160af413b4ff1c3b4741e8a7583f8e7197f201 (origin/nvme-6.5)
> > Author: Sagi Grimberg <sagi at grimberg.me>
> > Date:   Tue Jun 20 16:07:36 2023 +0300
> >
> >       nvme-mpath: fix I/O failure with EAGAIN when failing over I/O
> >
> > Also, I didn't find the following error message in dmesg.
> > blktests (master) # dmesg | grep "found same hostid"
> > blktests (master) #
> >
> > I'm confused about how exactly blktests are passing on the nvme-6.5 branch?
> >
> > I'll try linux v6.4 next and update you soon ..
> >
> > -ck
> >
> >
>
> I ran the blktests on linux-block/for-next with following HEAD:-
> commit 3261ea42710e9665c9151006049411bd23b5411f (origin/for-next)
> Merge: ad73f31646e0 6d85ebf95c44
> Author: Jens Axboe <axboe at kernel.dk>
> Date:   Mon Jun 26 09:53:41 2023 -0600
>
>      Merge branch 'for-6.5/block-late' into for-next
>
>      * for-6.5/block-late:
>        blk-sysfs: add a new attr_group for blk_mq
>        blk-iocost: move wbt_enable/disable_default() out of spinlock
>        blk-wbt: cleanup rwb_enabled() and wbt_disabled()
>        blk-wbt: remove dead code to handle wbt enable/disable with io
> inflight
>        blk-wbt: don't create wbt sysfs entry if CONFIG_BLK_WBT is disabled
>
>
>
> It's passing, can you please share :-
>
> 1. git repo link
> 2. git repo HEAD
> 3. blktests head
>
> I'll continue to debug the problem  ...
>
> -ck
>
> [1]
> linux-block (for-next) # git log -2
> commit 0be6b9ec7bd1aa1b5629ebf2701fa3aa7237d313 (HEAD -> for-next)
> Merge: 7bbb53e97bb3 3261ea42710e
> Author: Chaitanya Kulkarni <kch at nvidia.com>
> Date:   Tue Jun 27 15:40:14 2023 -0700
>
>      Merge branch 'for-next' of git://git.kernel.dk/linux-block into
> for-next
>
> commit 3261ea42710e9665c9151006049411bd23b5411f (origin/for-next)
> Merge: ad73f31646e0 6d85ebf95c44
> Author: Jens Axboe <axboe at kernel.dk>
> Date:   Mon Jun 26 09:53:41 2023 -0600
>
>      Merge branch 'for-6.5/block-late' into for-next
>
>      * for-6.5/block-late:
>        blk-sysfs: add a new attr_group for blk_mq
>        blk-iocost: move wbt_enable/disable_default() out of spinlock
>        blk-wbt: cleanup rwb_enabled() and wbt_disabled()
>        blk-wbt: remove dead code to handle wbt enable/disable with io
> inflight
>        blk-wbt: don't create wbt sysfs entry if CONFIG_BLK_WBT is disabled
> linux-block (for-next) # cdblktests
> blktests (master) # ./test-nvme.sh
> ################nvme_trtype=loop############
> nvme/002 (create many subsystems and test discovery) [passed]
>      runtime    ...  19.523s
> nvme/003 (test if we're sending keep-alives to a discovery controller)
> [passed]
>      runtime  10.095s  ...  10.082s
> nvme/004 (test nvme and nvmet UUID NS descriptors) [passed]
>      runtime  1.458s  ...  1.456s
> nvme/005 (reset local loopback target) [passed]
>      runtime  1.199s  ...  1.808s
> nvme/006 (create an NVMeOF target with a block device-backed ns) [passed]
>      runtime  0.059s  ...  0.066s
> nvme/007 (create an NVMeOF target with a file-backed ns) [passed]
>      runtime  0.044s  ...  0.033s
> nvme/008 (create an NVMeOF host with a block device-backed ns) [passed]
>      runtime  1.148s  ...  1.476s
> nvme/009 (create an NVMeOF host with a file-backed ns) [passed]
>      runtime  1.139s  ...  1.435s
> nvme/010 (run data verification fio job on NVMeOF block device-backed
> ns) [passed]
>      runtime  113.605s  ...  113.720s
> nvme/011 (run data verification fio job on NVMeOF file-backed ns) [passed]
>      runtime  83.124s  ...  80.072s
> nvme/012 (run mkfs and data verification fio job on NVMeOF block
> device-backed ns) [passed]
>      runtime  86.374s  ...  93.449s
> nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed
> ns) [passed]
>      runtime  78.565s  ...  64.751s
> nvme/014 (flush a NVMeOF block device-backed ns) [passed]
>      runtime  6.617s  ...  7.489s
> nvme/015 (unit test for NVMe flush for file backed ns) [passed]
>      runtime  5.939s  ...  6.214s
> nvme/016 (create/delete many NVMeOF block device-backed ns and test
> discovery) [passed]
>      runtime    ...  12.590s
> nvme/017 (create/delete many file-ns and test discovery) [passed]
>      runtime    ...  12.932s
> nvme/018 (unit test NVMe-oF out of range access on a file backend) [passed]
>      runtime  1.120s  ...  1.422s
> nvme/019 (test NVMe DSM Discard command on NVMeOF block-device ns) [passed]
>      runtime  1.148s  ...  1.437s
> nvme/020 (test NVMe DSM Discard command on NVMeOF file-backed ns) [passed]
>      runtime  1.124s  ...  1.437s
> nvme/021 (test NVMe list command on NVMeOF file-backed ns) [passed]
>      runtime  1.127s  ...  1.414s
> nvme/022 (test NVMe reset command on NVMeOF file-backed ns) [passed]
>      runtime  1.164s  ...  1.767s
> nvme/023 (test NVMe smart-log command on NVMeOF block-device ns) [passed]
>      runtime  1.151s  ...  1.442s
> nvme/024 (test NVMe smart-log command on NVMeOF file-backed ns) [passed]
>      runtime  1.112s  ...  1.440s
> nvme/025 (test NVMe effects-log command on NVMeOF file-backed ns) [passed]
>      runtime  1.121s  ...  1.420s
> nvme/026 (test NVMe ns-descs command on NVMeOF file-backed ns) [passed]
>      runtime  1.121s  ...  1.418s
> nvme/027 (test NVMe ns-rescan command on NVMeOF file-backed ns) [passed]
>      runtime  1.121s  ...  1.430s
> nvme/028 (test NVMe list-subsys command on NVMeOF file-backed ns) [passed]
>      runtime  1.130s  ...  1.415s
> nvme/029 (test userspace IO via nvme-cli read/write interface) [passed]
>      runtime  1.261s  ...  1.564s
> nvme/030 (ensure the discovery generation counter is updated
> appropriately) [passed]
>      runtime  0.115s  ...  0.205s
> nvme/031 (test deletion of NVMeOF controllers immediately after setup)
> [passed]
>      runtime  0.802s  ...  3.861s
> nvme/038 (test deletion of NVMeOF subsystem without enabling) [passed]
>      runtime  0.015s  ...  0.012s
> nvme/040 (test nvme fabrics controller reset/disconnect operation during
> I/O) [passed]
>      runtime  7.296s  ...  8.020s
> nvme/041 (Create authenticated connections) [passed]
>      runtime  0.448s  ...  0.759s
> nvme/042 (Test dhchap key types for authenticated connections) [passed]
>      runtime  2.757s  ...  4.790s
> nvme/043 (Test hash and DH group variations for authenticated
> connections) [passed]
>      runtime  0.705s  ...  6.904s
> nvme/044 (Test bi-directional authentication) [passed]
>      runtime  1.227s  ...  1.835s
> nvme/045 (Test re-authentication) [passed]
>      runtime  3.646s  ...  3.791s
> nvme/047 (test different queue types for fabric transports)  [not run]
>      runtime  1.718s  ...
>      nvme_trtype=loop is not supported in this test
> nvme/048 (Test queue count changes on reconnect)             [not run]
>      runtime  6.242s  ...
>      nvme_trtype=loop is not supported in this test
> ################nvme_trtype=tcp############
> nvme/002 (create many subsystems and test discovery)         [not run]
>      runtime  19.523s  ...
>      nvme_trtype=tcp is not supported in this test
> nvme/003 (test if we're sending keep-alives to a discovery controller)
> [passed]
>      runtime  10.082s  ...  10.087s
> nvme/004 (test nvme and nvmet UUID NS descriptors) [passed]
>      runtime  1.456s  ...  1.145s
> nvme/005 (reset local loopback target) [passed]
>      runtime  1.808s  ...  1.210s
> nvme/006 (create an NVMeOF target with a block device-backed ns) [passed]
>      runtime  0.066s  ...  0.055s
> nvme/007 (create an NVMeOF target with a file-backed ns) [passed]
>      runtime  0.033s  ...  0.045s
> nvme/008 (create an NVMeOF host with a block device-backed ns) [passed]
>      runtime  1.476s  ...  1.164s
> nvme/009 (create an NVMeOF host with a file-backed ns) [passed]
>      runtime  1.435s  ...  1.120s
> nvme/010 (run data verification fio job on NVMeOF block device-backed
> ns) [passed]
>      runtime  113.720s  ...  93.167s
> nvme/011 (run data verification fio job on NVMeOF file-backed ns) [passed]
>      runtime  80.072s  ...  81.600s
> nvme/012 (run mkfs and data verification fio job on NVMeOF block
> device-backed ns) [passed]
>      runtime  93.449s  ...  95.264s
> nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed
> ns) [passed]
>      runtime  64.751s  ...  66.072s
> nvme/014 (flush a NVMeOF block device-backed ns) [passed]
>      runtime  7.489s  ...  6.727s
> nvme/015 (unit test for NVMe flush for file backed ns) [passed]
>      runtime  6.214s  ...  5.914s
> nvme/016 (create/delete many NVMeOF block device-backed ns and test
> discovery) [not run]
>      runtime  12.590s  ...
>      nvme_trtype=tcp is not supported in this test
> nvme/017 (create/delete many file-ns and test discovery)     [not run]
>      runtime  12.932s  ...
>      nvme_trtype=tcp is not supported in this test
> nvme/018 (unit test NVMe-oF out of range access on a file backend) [passed]
>      runtime  1.422s  ...  1.121s
> nvme/019 (test NVMe DSM Discard command on NVMeOF block-device ns) [passed]
>      runtime  1.437s  ...  1.161s
> nvme/020 (test NVMe DSM Discard command on NVMeOF file-backed ns) [passed]
>      runtime  1.437s  ...  1.122s
> nvme/021 (test NVMe list command on NVMeOF file-backed ns) [passed]
>      runtime  1.414s  ...  1.123s
> nvme/022 (test NVMe reset command on NVMeOF file-backed ns) [passed]
>      runtime  1.767s  ...  1.172s
> nvme/023 (test NVMe smart-log command on NVMeOF block-device ns) [passed]
>      runtime  1.442s  ...  1.144s
> nvme/024 (test NVMe smart-log command on NVMeOF file-backed ns) [passed]
>      runtime  1.440s  ...  1.122s
> nvme/025 (test NVMe effects-log command on NVMeOF file-backed ns) [passed]
>      runtime  1.420s  ...  1.119s
> nvme/026 (test NVMe ns-descs command on NVMeOF file-backed ns) [passed]
>      runtime  1.418s  ...  1.114s
> nvme/027 (test NVMe ns-rescan command on NVMeOF file-backed ns) [passed]
>      runtime  1.430s  ...  1.140s
> nvme/028 (test NVMe list-subsys command on NVMeOF file-backed ns) [passed]
>      runtime  1.415s  ...  1.117s
> nvme/029 (test userspace IO via nvme-cli read/write interface) [passed]
>      runtime  1.564s  ...  1.258s
> nvme/030 (ensure the discovery generation counter is updated
> appropriately) [passed]
>      runtime  0.205s  ...  0.137s
> nvme/031 (test deletion of NVMeOF controllers immediately after setup)
> [passed]
>      runtime  3.861s  ...  0.849s
> nvme/038 (test deletion of NVMeOF subsystem without enabling) [passed]
>      runtime  0.012s  ...  0.016s
> nvme/040 (test nvme fabrics controller reset/disconnect operation during
> I/O) [passed]
>      runtime  8.020s  ...  7.273s
> nvme/041 (Create authenticated connections) [passed]
>      runtime  0.759s  ...  0.440s
> nvme/042 (Test dhchap key types for authenticated connections) [passed]
>      runtime  4.790s  ...  2.712s
> nvme/043 (Test hash and DH group variations for authenticated
> connections) [passed]
>      runtime  6.904s  ...  0.731s
> nvme/044 (Test bi-directional authentication) [passed]
>      runtime  1.835s  ...  1.240s
> nvme/045 (Test re-authentication) [passed]
>      runtime  3.791s  ...  3.630s
> nvme/047 (test different queue types for fabric transports) [passed]
>      runtime    ...  1.701s
> nvme/048 (Test queue count changes on reconnect) [passed]
>      runtime    ...  6.244s
> blktests (master) #
>


-- 
Best Regards,
  Yi Zhang



