nvme-fc: waiting in invalid context bug
Daniel Wagner
dwagner at suse.de
Wed Aug 23 04:43:00 PDT 2023
While working on FC support for blktests, I ran into the bug report below.
If I read this correctly, ee6fdc5055e9 ("nvme-fc: fix race between error
recovery and creating association") introduced this bug. The trace follows;
a short sketch of the offending lock nesting is after it.
=============================
[ BUG: Invalid wait context ]
6.5.0-rc2+ #16 Tainted: G W
-----------------------------
kworker/u8:5/105 is trying to lock:
ffff8881127d4748 (&ctrl->namespaces_rwsem){++++}-{3:3}, at: nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core]
other info that might help us debug this:
context-{4:4}
3 locks held by kworker/u8:5/105:
#0: ffff8881182cd148 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: process_one_work+0x7a6/0x1180
#1: ffff888110fa7d20 ((work_completion)(&(&ctrl->connect_work)->work)){+.+.}-{0:0}, at: process_one_work+0x7e9/0x1180
#2: ffff8881127d4018 (&ctrl->lock#2){....}-{2:2}, at: nvme_fc_connect_ctrl_work+0x1715/0x1be0 [nvme_fc]
stack backtrace:
CPU: 1 PID: 105 Comm: kworker/u8:5 Tainted: G W 6.5.0-rc2+ #16 4796ef1f1e7efc9e14ac22d8802d2575bb3a3aef
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc]
Call Trace:
<TASK>
dump_stack_lvl+0x5b/0x80
__lock_acquire+0x17e8/0x7e70
? mark_lock+0x94/0x350
? verify_lock_unused+0x150/0x150
? verify_lock_unused+0x150/0x150
? lock_acquire+0x16d/0x410
? process_one_work+0x1180/0x1180
? lock_release+0x2aa/0xd30
? __cfi_lock_release+0x10/0x10
? start_flush_work+0x553/0x610
lock_acquire+0x16d/0x410
? nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
? __cfi_lock_acquire+0x10/0x10
? __wake_up+0x120/0x200
? lock_release+0x2aa/0xd30
? nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
down_read+0xa7/0xa10
? nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
? try_to_grab_pending+0x86/0x480
? __cfi_down_read+0x10/0x10
? __cancel_work_timer+0x3a1/0x480
? _raw_spin_unlock_irqrestore+0x24/0x50
? cancel_work_sync+0x20/0x20
? __cfi_lock_release+0x10/0x10
nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
nvme_change_ctrl_state+0x208/0x2e0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
nvme_fc_connect_ctrl_work+0x17a4/0x1be0 [nvme_fc 40247846cbe6ec64af4ae5bef38fd58d34ff3bbd]
process_one_work+0x89c/0x1180
? rescuer_thread+0x1150/0x1150
? do_raw_spin_trylock+0xc9/0x1f0
? lock_acquired+0x310/0x9b0
? worker_thread+0xd5e/0x1260
worker_thread+0x91e/0x1260
kthread+0x25d/0x2f0
? __cfi_worker_thread+0x10/0x10
? __cfi_kthread+0x10/0x10
ret_from_fork+0x41/0x70
? __cfi_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
RIP: 0000:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
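
As the trace shows, nvme_fc_connect_ctrl_work() is holding &ctrl->lock (a
spinlock, wait context {2:2}) and then, via nvme_change_ctrl_state() ->
nvme_kick_requeue_lists(), does down_read() on &ctrl->namespaces_rwsem (a
sleeping lock, {3:3}), which is what CONFIG_PROVE_RAW_LOCK_NESTING flags as
an invalid wait context. A minimal sketch of that pattern, with made-up
names and not the actual driver code:

/*
 * Illustrative only -- not the nvme-fc code.  A sleeping lock
 * (rw_semaphore) is acquired while a spinlock is held, which lockdep
 * reports as "Invalid wait context".
 */
#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/rwsem.h>

static DEFINE_SPINLOCK(demo_lock);	/* wait context {2:2}, non-sleeping */
static DECLARE_RWSEM(demo_rwsem);	/* wait context {3:3}, may sleep */

static int __init demo_init(void)
{
	unsigned long flags;

	spin_lock_irqsave(&demo_lock, flags);
	down_read(&demo_rwsem);		/* sleeping lock under a spinlock -> splat */
	up_read(&demo_rwsem);
	spin_unlock_irqrestore(&demo_lock, flags);

	return 0;
}

static void __exit demo_exit(void)
{
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");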