blktests failures with v6.12-rc1 kernel
Shinichiro Kawasaki
shinichiro.kawasaki at wdc.com
Thu Oct 3 01:02:57 PDT 2024
Hi all,
I ran the latest blktests (git hash: 80430afc5589) with the v6.12-rc1 kernel,
and I observed three failure symptoms listed below.
Compared with the previous report for the v6.11 kernel [1],
- the v6.12 kernel has one new failure symptom, #3 in the srp test group, and
- the v6.12 kernel has one fewer failure: the one observed with the test case
scsi/008, which was addressed on the kernel side.
[1] https://lore.kernel.org/linux-block/3aydm6iazrkdxb4d5yb3tc7fjqax6nvukrn3tpvzjcom6woc5g@qbai6zlvsrbs/
List of failures
================
#1: nvme/014 (tcp transport)
#2: nvme/041 (fc transport)
#3: srp/001,002,011,012,013,014,016
Failure description
===================
#1: nvme/014 (tcp transport)
With the trtype=tcp configuration, nvme/014 fails occasionally with the
kernel message "DEBUG_LOCKS_WARN_ON(lock->magic != lock)". The failure is
rare: it takes 200 repetitions of the test case to recreate it. A candidate
fix patch was posted [2].
[2] https://lore.kernel.org/linux-nvme/20241002045141.1975881-1-shinichiro.kawasaki@wdc.com/
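
Since the failure only shows up after many repetitions, a small driver loop can help when trying to recreate it. The following is a minimal sketch, not part of blktests. It assumes that `./check` exits with a non-zero status when a test case fails and that the transport is selected with the `NVMET_TRTYPES` environment variable (older blktests versions used `nvme_trtype` instead). The helper names `repeat_until_failure` and `run_nvme_014_tcp` are hypothetical.

```python
import os
import subprocess


def repeat_until_failure(run_once, max_runs=200):
    """Call run_once() up to max_runs times.

    Returns the 1-based index of the first failing run, or None if every
    run passed.
    """
    for attempt in range(1, max_runs + 1):
        if not run_once():
            return attempt
    return None


def run_nvme_014_tcp(blktests_dir="."):
    """Run nvme/014 once with the tcp transport.

    Returns True when ./check exits with status 0 (assumed to mean "passed").
    """
    # Assumption: NVMET_TRTYPES selects the transport in this blktests version.
    env = dict(os.environ, NVMET_TRTYPES="tcp")
    result = subprocess.run(["./check", "nvme/014"], cwd=blktests_dir, env=env)
    return result.returncode == 0
```

Calling `repeat_until_failure(run_nvme_014_tcp, max_runs=200)` from the blktests directory as root would then stop at the first failing run, so that the kernel log can be inspected right after the failure.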
#2: nvme/041 (fc transport)
With the trtype=fc configuration, nvme/041 fails:
nvme/041 (Create authenticated connections) [failed]
runtime 2.677s ... 4.823s
--- tests/nvme/041.out 2023-11-29 12:57:17.206898664 +0900
+++ /home/shin/Blktests/blktests/results/nodev/nvme/041.out.bad 2024-03-19 14:50:56.399101323 +0900
@@ -2,5 +2,5 @@
Test unauthenticated connection (should fail)
disconnected 0 controller(s)
Test authenticated connection
-disconnected 1 controller(s)
+disconnected 0 controller(s)
Test complete
nvme/044 had the same failure symptom until kernel v6.9. A solution was
suggested and discussed in February 2024 [3].
[3] https://lore.kernel.org/linux-nvme/20240221132404.6311-1-dwagner@suse.de/
#3: srp/001,002,011,012,013,014,016
Seven test cases in the srp test group failed due to the WARN
"kmem_cache of name 'srpt-rsp-buf' already exists" [4]. The failures are
recreated reliably. They need further debugging.
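
To rerun only the affected test cases, the compact list in the heading can be expanded into individual test names. This is a small illustrative sketch; `expand_tests` is a hypothetical helper, and passing several test names to `./check` in one invocation is assumed to work as in current blktests.

```python
def expand_tests(spec):
    """Expand 'group/001,002' into ['group/001', 'group/002']."""
    group, _, numbers = spec.partition("/")
    return [f"{group}/{n.strip()}" for n in numbers.split(",") if n.strip()]


# The seven srp test cases listed for failure #3.
srp_tests = expand_tests("srp/001,002,011,012,013,014,016")
```

The resulting names can be appended to the `./check` command line, for example with `["./check", *srp_tests]` as the argument list of `subprocess.run`.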
[4]
[ 3833.868986] [ T120648] ------------[ cut here ]------------
[ 3833.870223] [ T120648] kmem_cache of name 'srpt-rsp-buf' already exists
[ 3833.871490] [ T120648] WARNING: CPU: 1 PID: 120648 at mm/slab_common.c:107 __kmem_cache_create_args+0xa3/0x300
[ 3833.873136] [ T120648] Modules linked in: ib_srp scsi_transport_srp target_core_user target_core_pscsi target_core_file ib_srpt target_core_iblock target_core_mod rdma_cm scsi_debug siw ib_uverbs null_blk ib_umad crc32_generic dm_service_time nbd iw_cm ib_cm ib_core pktcdvd nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr sunrpc 9pnet_virtio 9pnet ppdev netfs pcspkr i2c_piix4 e1000 parport_pc i2c_smbus parport fuse loop nfnetlink zram bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper xfs nvme nvme_core drm floppy sym53c8xx scsi_transport_spi nvme_auth serio_raw ata_generic pata_acpi dm_multipath qemu_fw_cfg [last unloaded: null_blk]
[ 3833.882920] [ T120648] CPU: 1 UID: 0 PID: 120648 Comm: kworker/u16:55 Tainted: G B W 6.12.0-rc1 #334
[ 3833.884767] [ T120648] Tainted: [B]=BAD_PAGE, [W]=WARN
[ 3833.886258] [ T120648] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
[ 3833.887979] [ T120648] Workqueue: iw_cm_wq cm_work_handler [iw_cm]
[ 3833.889520] [ T120648] RIP: 0010:__kmem_cache_create_args+0xa3/0x300
[ 3833.891016] [ T120648] Code: 8d 58 98 48 3d d0 a7 25 99 74 21 48 8b 7b 60 48 89 ee e8 30 cd 06 02 85 c0 75 e0 48 89 ee 48 c7 c7 d0 db b0 98 e8 dd 92 82 ff <0f> 0b be 20 00 00 00 48 89 ef e8 8e cd 06 02 48 85 c0 0f 85 02 02
[ 3833.894873] [ T120648] RSP: 0018:ffff8881788f7508 EFLAGS: 00010292
[ 3833.896546] [ T120648] RAX: 0000000000000000 RBX: ffff888104be3540 RCX: 0000000000000000
[ 3833.898237] [ T120648] RDX: 0000000000000000 RSI: ffffffff981bea60 RDI: 0000000000000001
[ 3833.899973] [ T120648] RBP: ffffffffc1f52c20 R08: 0000000000000001 R09: ffffed102f11ee4b
[ 3833.901715] [ T120648] R10: ffff8881788f725f R11: 00000000001b9378 R12: 0000000000000100
[ 3833.903509] [ T120648] R13: ffff8881788f76c8 R14: 0000000000000000 R15: 0000000000000000
[ 3833.905378] [ T120648] FS: 0000000000000000(0000) GS:ffff8883ae080000(0000) knlGS:0000000000000000
[ 3833.907167] [ T120648] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3833.908972] [ T120648] CR2: 00007fdbbefa1474 CR3: 0000000124b3a000 CR4: 00000000000006f0
[ 3833.910941] [ T120648] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3833.912807] [ T120648] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3833.914626] [ T120648] Call Trace:
[ 3833.915994] [ T120648] <TASK>
[ 3833.917398] [ T120648] ? __warn.cold+0x5f/0x1f8
[ 3833.918855] [ T120648] ? __kmem_cache_create_args+0xa3/0x300
[ 3833.920464] [ T120648] ? report_bug+0x1ec/0x390
[ 3833.921945] [ T120648] ? handle_bug+0x58/0x90
[ 3833.923442] [ T120648] ? exc_invalid_op+0x13/0x40
[ 3833.924906] [ T120648] ? asm_exc_invalid_op+0x16/0x20
[ 3833.926457] [ T120648] ? __kmem_cache_create_args+0xa3/0x300
[ 3833.928255] [ T120648] ? __kmem_cache_create_args+0xa3/0x300
[ 3833.929985] [ T120648] srpt_cm_req_recv.cold+0xea0/0x44cc [ib_srpt]
[ 3833.931717] [ T120648] ? vsnprintf+0x38b/0x18f0
[ 3833.933255] [ T120648] ? __pfx_vsnprintf+0x10/0x10
[ 3833.934858] [ T120648] ? xas_start+0x93/0x500
[ 3833.936400] [ T120648] ? __pfx_srpt_cm_req_recv+0x10/0x10 [ib_srpt]
[ 3833.938150] [ T120648] ? snprintf+0xa5/0xe0
[ 3833.939611] [ T120648] ? __pfx_snprintf+0x10/0x10
[ 3833.941121] [ T120648] ? lock_release+0x57a/0x7a0
[ 3833.942652] [ T120648] srpt_rdma_cm_req_recv+0x35d/0x460 [ib_srpt]
[ 3833.944234] [ T120648] ? __pfx_srpt_rdma_cm_req_recv+0x10/0x10 [ib_srpt]
[ 3833.945844] [ T120648] ? rcu_is_watching+0x11/0xb0
[ 3833.947311] [ T120648] ? trace_cm_event_handler+0xf5/0x140 [rdma_cm]
[ 3833.948835] [ T120648] cma_cm_event_handler+0x88/0x210 [rdma_cm]
[ 3833.950302] [ T120648] iw_conn_req_handler+0x7a8/0xf10 [rdma_cm]
[ 3833.951766] [ T120648] ? __pfx_iw_conn_req_handler+0x10/0x10 [rdma_cm]
[ 3833.953252] [ T120648] ? alloc_work_entries+0x12f/0x260 [iw_cm]
[ 3833.954602] [ T120648] cm_work_handler+0x143f/0x1ba0 [iw_cm]
[ 3833.955904] [ T120648] ? __pfx_cm_work_handler+0x10/0x10 [iw_cm]
[ 3833.957213] [ T120648] ? process_one_work+0x7de/0x1460
[ 3833.958412] [ T120648] ? lock_acquire+0x2d/0xc0
[ 3833.959538] [ T120648] ? process_one_work+0x7de/0x1460
[ 3833.960672] [ T120648] process_one_work+0x85a/0x1460
[ 3833.961764] [ T120648] ? __pfx_process_one_work+0x10/0x10
[ 3833.962861] [ T120648] ? assign_work+0x16c/0x240
[ 3833.963901] [ T120648] worker_thread+0x5e2/0xfc0
[ 3833.964926] [ T120648] ? __pfx_worker_thread+0x10/0x10
[ 3833.965983] [ T120648] kthread+0x2d1/0x3a0
[ 3833.966935] [ T120648] ? trace_irq_enable.constprop.0+0xce/0x110
[ 3833.968000] [ T120648] ? __pfx_kthread+0x10/0x10
[ 3833.968956] [ T120648] ret_from_fork+0x30/0x70
[ 3833.969890] [ T120648] ? __pfx_kthread+0x10/0x10
[ 3833.970837] [ T120648] ret_from_fork_asm+0x1a/0x30
[ 3833.971792] [ T120648] </TASK>
[ 3833.972609] [ T120648] irq event stamp: 0
[ 3833.973489] [ T120648] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 3833.974605] [ T120648] hardirqs last disabled at (0): [<ffffffff95204727>] copy_process+0x1ef7/0x8480
[ 3833.975860] [ T120648] softirqs last enabled at (0): [<ffffffff9520478c>] copy_process+0x1f5c/0x8480
[ 3833.977096] [ T120648] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 3833.978192] [ T120648] ---[ end trace 0000000000000000 ]---
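
In a trace like [4], entries prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain, so filtering them out gives a shorter view of the call path. The following is a rough sketch for doing that on a saved log. The function name `reliable_frames` and the regular expression are illustrative assumptions that match the log format shown above, not a general dmesg parser.

```python
import re

# Matches a frame such as
#   "[ 3833.929985] [ T120648] srpt_cm_req_recv.cold+0xea0/0x44cc [ib_srpt]"
FRAME_RE = re.compile(
    r"^(?:\[[^\]]*\]\s*)*"                       # bracketed timestamp/task prefixes
    r"(?P<unreliable>\?\s+)?"                    # "? " marks an unconfirmed entry
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"           # optional "[module]" suffix
)


def reliable_frames(log_text):
    """Return (symbol, module) pairs for frames between <TASK> and </TASK>
    that are not prefixed with "?". module is None for built-in code."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "<TASK>" in line:
            in_trace = True
            continue
        if "</TASK>" in line:
            in_trace = False
            continue
        if not in_trace:
            continue
        match = FRAME_RE.match(line.strip())
        if match and not match.group("unreliable"):
            frames.append((match.group("symbol"), match.group("module")))
    return frames
```

Applied to the trace above, the first entry without "?" is srpt_cm_req_recv.cold in ib_srpt, reached from cm_work_handler in iw_cm through the rdma_cm handlers.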