bad unlock balance WARNING at nvme/045
Shinichiro Kawasaki
shinichiro.kawasaki at wdc.com
Tue Oct 18 01:03:51 PDT 2022
Hello Hannes,
I observed "WARNING: bad unlock balance detected!" at nvme/045 [1]. As the Call
Trace shows, nvme_auth_reset() has an unbalanced mutex lock/unlock:
	mutex_lock(&ctrl->dhchap_auth_mutex);
	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
		mutex_unlock(&ctrl->dhchap_auth_mutex);
		flush_work(&chap->auth_work);
		__nvme_auth_reset(chap);
	}
	mutex_unlock(&ctrl->dhchap_auth_mutex);
I tried removing the mutex_unlock() in the list iteration with a patch [2], but
that resulted in another warning, "WARNING: possible recursive locking
detected" [3]. I'm not sure, but the cause of this warning could be that
__nvme_auth_work and nvme_dhchap_auth_work run on the same nvme_wq, so
flush_work() is called from a work item on the workqueue being flushed.
Could you take a look and suggest a fix?
[1]
[ 89.883480] loop: module loaded
[ 90.271575] run blktests nvme/045 at 2022-10-18 15:13:33
[ 90.774324] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 91.410877] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid: with DH-HMAC-CHAP.
[ 91.454315] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 91.456445] nvme nvme6: qid 0: controller authenticated
[ 91.458581] nvme nvme6: qid 0: authenticated
[ 91.466122] nvme nvme6: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
[ 91.469709] nvme nvme6: creating 4 I/O queues.
[ 91.620115] nvme nvme6: new ctrl: "blktests-subsystem-1"
[ 93.775756] =====================================
[ 93.776961] WARNING: bad unlock balance detected!
[ 93.778088] 6.1.0-rc1 #3 Not tainted
[ 93.779011] -------------------------------------
[ 93.780090] check/961 is trying to release lock (&ctrl->dhchap_auth_mutex) at:
[ 93.781726] [<ffffffffc03bf28b>] nvme_auth_reset+0x5b/0xb0 [nvme_core]
[ 93.783310] but there are no more locks to release!
[ 93.784461]
other info that might help us debug this:
[ 93.786087] 3 locks held by check/961:
[ 93.787098] #0: ffff8881135fa460 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xe7/0x1b0
[ 93.788923] #1: ffff88811fb1d888 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x21d/0x530
[ 93.790888] #2: ffff88810d8160f0 (kn->active#110){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x241/0x530
[ 93.792960]
stack backtrace:
[ 93.794168] CPU: 1 PID: 961 Comm: check Not tainted 6.1.0-rc1 #3
[ 93.795552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 93.798016] Call Trace:
[ 93.798716] <TASK>
[ 93.799298] dump_stack_lvl+0x5b/0x77
[ 93.800251] lock_release.cold+0x10/0x4e
[ 93.801244] ? nvme_auth_reset+0x5b/0xb0 [nvme_core]
[ 93.802479] ? reacquire_held_locks+0x4e0/0x4e0
[ 93.803537] ? mark_held_locks+0x9e/0xe0
[ 93.804544] __mutex_unlock_slowpath+0x8c/0x5f0
[ 93.805650] ? kasan_quarantine_put+0x94/0x1f0
[ 93.806759] ? bit_wait_timeout+0x170/0x170
[ 93.807757] ? __nvme_auth_reset+0x198/0x3d0 [nvme_core]
[ 93.809010] ? __kmem_cache_free+0xa9/0x390
[ 93.810059] nvme_auth_reset+0x5b/0xb0 [nvme_core]
[ 93.811253] nvme_ctrl_dhchap_secret_store+0x1b4/0x1d0 [nvme_core]
[ 93.812710] kernfs_fop_write_iter+0x356/0x530
[ 93.813754] vfs_write+0x519/0xc50
[ 93.814513] ? kernel_write+0x590/0x590
[ 93.815410] ? __up_read+0x182/0x700
[ 93.816252] ? __fget_light+0x51/0x230
[ 93.817092] ksys_write+0xe7/0x1b0
[ 93.817888] ? __ia32_sys_read+0xa0/0xa0
[ 93.818750] ? lockdep_hardirqs_on_prepare+0x17b/0x410
[ 93.819901] ? syscall_enter_from_user_mode+0x22/0xc0
[ 93.820929] ? lockdep_hardirqs_on+0x7d/0x100
[ 93.821848] do_syscall_64+0x37/0x90
[ 93.822583] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 93.823615] RIP: 0033:0x7f6f84d018f7
[ 93.824372] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 93.827699] RSP: 002b:00007ffcdb0ce5d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 93.829137] RAX: ffffffffffffffda RBX: 000000000000003c RCX: 00007f6f84d018f7
[ 93.830502] RDX: 000000000000003c RSI: 00005582004ce6e0 RDI: 0000000000000001
[ 93.831879] RBP: 00005582004ce6e0 R08: 0000000000000000 R09: 0000000000000073
[ 93.833236] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000003c
[ 93.834594] R13: 00007f6f84df9780 R14: 000000000000003c R15: 00007f6f84df49e0
[ 93.835979] </TASK>
[ 93.836823] nvme nvme6: re-authenticating controller
[ 93.871885] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 93.873754] nvme nvme6: qid 0: controller authenticated
[ 93.916346] nvme nvme6: re-authenticating controller
[ 93.948156] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 93.949498] nvme nvme6: qid 0: controller authenticated
[ 93.992725] nvme nvme6: re-authenticating controller
[ 94.025843] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 94.027160] nvme nvme6: qid 0: controller authenticated
[ 98.836070] nvme nvme6: re-authenticating controller
[ 99.345893] nvme nvme6: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 99.347859] nvme nvme6: qid 0: controller authenticated
[ 99.736321] nvme nvme6: re-authenticating controller
[ 100.014469] nvme nvme6: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 100.015192] nvme nvme6: qid 0: controller authenticated
[ 101.560718] nvme nvme6: Removing ctrl: NQN "blktests-subsystem-1"
[2]
diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index c8a6db7c4498..4e824aab30eb 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -926,7 +926,6 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
 	mutex_lock(&ctrl->dhchap_auth_mutex);
 	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
-		mutex_unlock(&ctrl->dhchap_auth_mutex);
 		flush_work(&chap->auth_work);
 		__nvme_auth_reset(chap);
 	}
[3]
[ 2678.264545] loop: module loaded
[ 2678.679919] run blktests nvme/045 at 2022-10-18 16:34:00
[ 2679.231157] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 2679.854789] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid: with DH-HMAC-CHAP.
[ 2679.898208] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2679.900508] nvme nvme6: qid 0: controller authenticated
[ 2679.902183] nvme nvme6: qid 0: authenticated
[ 2679.907944] nvme nvme6: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
[ 2679.912737] nvme nvme6: creating 4 I/O queues.
[ 2680.068539] nvme nvme6: new ctrl: "blktests-subsystem-1"
[ 2682.156749] nvme nvme6: re-authenticating controller
[ 2682.160640] ============================================
[ 2682.161903] WARNING: possible recursive locking detected
[ 2682.163112] 6.1.0-rc1+ #4 Not tainted
[ 2682.164063] --------------------------------------------
[ 2682.165319] kworker/u8:0/927 is trying to acquire lock:
[ 2682.166573] ffff888119d27138 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: __flush_work+0x40e/0x900
[ 2682.168529]
but task is already holding lock:
[ 2682.170022] ffff888119d27138 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: process_one_work+0x73c/0x1300
[ 2682.172088]
other info that might help us debug this:
[ 2682.173695] Possible unsafe locking scenario:
[ 2682.175196] CPU0
[ 2682.175908] ----
[ 2682.176559] lock((wq_completion)nvme-wq);
[ 2682.177589] lock((wq_completion)nvme-wq);
[ 2682.178631]
*** DEADLOCK ***
[ 2682.180252] May be due to missing lock nesting notation
[ 2682.181918] 3 locks held by kworker/u8:0/927:
[ 2682.182988] #0: ffff888119d27138 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: process_one_work+0x73c/0x1300
[ 2682.185078] #1: ffff88810c8bfdd0 ((work_completion)(&ctrl->dhchap_auth_work)){+.+.}-{0:0}, at: process_one_work+0x769/0x1300
[ 2682.187535] #2: ffffffffb483d560 (rcu_read_lock){....}-{1:2}, at: __flush_work+0xc2/0x900
[ 2682.189373]
stack backtrace:
[ 2682.190590] CPU: 2 PID: 927 Comm: kworker/u8:0 Not tainted 6.1.0-rc1+ #4
[ 2682.192118] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 2682.194630] Workqueue: nvme-wq nvme_dhchap_auth_work [nvme_core]
[ 2682.196108] Call Trace:
[ 2682.196755] <TASK>
[ 2682.197290] dump_stack_lvl+0x5b/0x77
[ 2682.198143] __lock_acquire.cold+0x36f/0x3f5
[ 2682.198510] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2682.199075] ? lock_chain_count+0x20/0x20
[ 2682.199089] ? lockdep_hardirqs_on_prepare+0x410/0x410
[ 2682.200949] nvme nvme6: qid 0: controller authenticated
[ 2682.201771] lock_acquire+0x194/0x4f0
[ 2682.201783] ? __flush_work+0x40e/0x900
[ 2682.201796] ? lock_downgrade+0x6b0/0x6b0
[ 2682.201808] ? mark_held_locks+0x9e/0xe0
[ 2682.207078] ? lockdep_hardirqs_on_prepare+0x17b/0x410
[ 2682.208041] __flush_work+0x42e/0x900
[ 2682.208736] ? __flush_work+0x40e/0x900
[ 2682.209507] ? queue_delayed_work_on+0x90/0x90
[ 2682.210322] ? flush_workqueue_prep_pwqs+0x3f0/0x3f0
[ 2682.211209] nvme_dhchap_auth_work+0xf1/0x1f8 [nvme_core]
[ 2682.212163] process_one_work+0x816/0x1300
[ 2682.212858] ? lock_downgrade+0x6b0/0x6b0
[ 2682.213589] ? pwq_dec_nr_in_flight+0x230/0x230
[ 2682.214367] ? rwlock_bug.part.0+0x90/0x90
[ 2682.215062] worker_thread+0xfc/0x1270
[ 2682.215670] ? __kthread_parkme+0xc1/0x1f0
[ 2682.216367] ? process_one_work+0x1300/0x1300
[ 2682.217052] kthread+0x29b/0x340
[ 2682.217620] ? kthread_complete_and_exit+0x20/0x20
[ 2682.218313] ret_from_fork+0x1f/0x30
[ 2682.218922] </TASK>
[ 2682.254321] nvme nvme6: re-authenticating controller
[ 2682.292068] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2682.293937] nvme nvme6: qid 0: controller authenticated
[ 2682.334786] nvme nvme6: re-authenticating controller
[ 2682.364032] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2682.365898] nvme nvme6: qid 0: controller authenticated
[ 2686.934974] nvme nvme6: re-authenticating controller
[ 2687.345435] nvme nvme6: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 2687.347055] nvme nvme6: qid 0: controller authenticated
[ 2687.738816] nvme nvme6: re-authenticating controller
[ 2688.133101] nvme nvme6: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 2688.135123] nvme nvme6: qid 0: controller authenticated
[ 2689.624308] nvme nvme6: Removing ctrl: NQN "blktests-subsystem-1"
[ 2886.071612] loop: module loaded
[ 2886.269473] run blktests nvme/045 at 2022-10-18 16:37:28
[ 2886.533625] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 2886.726637] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid: with DH-HMAC-CHAP.
[ 2886.763826] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2886.765778] nvme nvme6: qid 0: controller authenticated
[ 2886.767199] nvme nvme6: qid 0: authenticated
[ 2886.769632] nvme nvme6: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
[ 2886.772454] nvme nvme6: creating 4 I/O queues.
[ 2886.918047] nvme nvme6: new ctrl: "blktests-subsystem-1"
[ 2888.496721] nvme nvme6: re-authenticating controller
[ 2888.532075] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2888.534031] nvme nvme6: qid 0: controller authenticated
[ 2888.583146] nvme nvme6: re-authenticating controller
[ 2888.615654] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2888.617588] nvme nvme6: qid 0: controller authenticated
[ 2888.656311] nvme nvme6: re-authenticating controller
[ 2888.683953] nvme nvme6: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 2888.685868] nvme nvme6: qid 0: controller authenticated
[ 2888.724955] nvme nvme6: re-authenticating controller
[ 2889.200478] nvme nvme6: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 2889.202425] nvme nvme6: qid 0: controller authenticated
[ 2889.625949] nvme nvme6: re-authenticating controller
[ 2889.992842] nvme nvme6: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe8192
[ 2889.994815] nvme nvme6: qid 0: controller authenticated
[ 2891.514773] nvme nvme6: Removing ctrl: NQN "blktests-subsystem-1"
--
Shin'ichiro Kawasaki