nvme-fc: waiting in invalid context bug
Daniel Wagner
dwagner at suse.de
Wed Aug 23 04:59:39 PDT 2023
On Wed, Aug 23, 2023 at 01:43:00PM +0200, Daniel Wagner wrote:
> While working on FC support in blktests, I ran into the bug report below.
>
> If I read this correctly, ee6fdc5055e9 ("nvme-fc: fix race between error
> recovery and creating association") introduced this bug.
Reverting this commit makes the report go away.
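
From the splat, nvme_fc_connect_ctrl_work() holds the ctrl->lock
spinlock while nvme_change_ctrl_state() ends up in
nvme_kick_requeue_lists(), which does down_read() on the sleeping
namespaces_rwsem. A minimal sketch of that locking pattern (the names
mirror the splat; the simplification is mine, not the actual driver
code):

#include <linux/spinlock.h>
#include <linux/rwsem.h>

static DEFINE_SPINLOCK(ctrl_lock);      /* stands in for ctrl->lock */
static DECLARE_RWSEM(namespaces_rwsem); /* stands in for ctrl->namespaces_rwsem */

static void connect_work_sketch(void)
{
	unsigned long flags;

	spin_lock_irqsave(&ctrl_lock, flags);
	/*
	 * Taking a sleeping lock (rwsem) while holding a spinlock may
	 * schedule in atomic context; with CONFIG_PROVE_RAW_LOCK_NESTING
	 * lockdep reports this as "BUG: Invalid wait context", as seen
	 * in the quoted splat below.
	 */
	down_read(&namespaces_rwsem);
	up_read(&namespaces_rwsem);
	spin_unlock_irqrestore(&ctrl_lock, flags);
}
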
> =============================
> [ BUG: Invalid wait context ]
> 6.5.0-rc2+ #16 Tainted: G W
> -----------------------------
> kworker/u8:5/105 is trying to lock:
> ffff8881127d4748 (&ctrl->namespaces_rwsem){++++}-{3:3}, at: nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core]
> other info that might help us debug this:
> context-{4:4}
> 3 locks held by kworker/u8:5/105:
> #0: ffff8881182cd148 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: process_one_work+0x7a6/0x1180
> #1: ffff888110fa7d20 ((work_completion)(&(&ctrl->connect_work)->work)){+.+.}-{0:0}, at: process_one_work+0x7e9/0x1180
> #2: ffff8881127d4018 (&ctrl->lock#2){....}-{2:2}, at: nvme_fc_connect_ctrl_work+0x1715/0x1be0 [nvme_fc]
> stack backtrace:
> CPU: 1 PID: 105 Comm: kworker/u8:5 Tainted: G W 6.5.0-rc2+ #16 4796ef1f1e7efc9e14ac22d8802d2575bb3a3aef
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
> Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc]
> Call Trace:
> <TASK>
> dump_stack_lvl+0x5b/0x80
> __lock_acquire+0x17e8/0x7e70
> ? mark_lock+0x94/0x350
> ? verify_lock_unused+0x150/0x150
> ? verify_lock_unused+0x150/0x150
> ? lock_acquire+0x16d/0x410
> ? process_one_work+0x1180/0x1180
> ? lock_release+0x2aa/0xd30
> ? __cfi_lock_release+0x10/0x10
> ? start_flush_work+0x553/0x610
> lock_acquire+0x16d/0x410
> ? nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
> ? __cfi_lock_acquire+0x10/0x10
> ? __wake_up+0x120/0x200
> ? lock_release+0x2aa/0xd30
> ? nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
> down_read+0xa7/0xa10
> ? nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
> ? try_to_grab_pending+0x86/0x480
> ? __cfi_down_read+0x10/0x10
> ? __cancel_work_timer+0x3a1/0x480
> ? _raw_spin_unlock_irqrestore+0x24/0x50
> ? cancel_work_sync+0x20/0x20
> ? __cfi_lock_release+0x10/0x10
> nvme_kick_requeue_lists+0x31/0x1d0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
> nvme_change_ctrl_state+0x208/0x2e0 [nvme_core af1437cccf764f8f599077b8e0f169b94f7f9966]
> nvme_fc_connect_ctrl_work+0x17a4/0x1be0 [nvme_fc 40247846cbe6ec64af4ae5bef38fd58d34ff3bbd]
> process_one_work+0x89c/0x1180
> ? rescuer_thread+0x1150/0x1150
> ? do_raw_spin_trylock+0xc9/0x1f0
> ? lock_acquired+0x310/0x9b0
> ? worker_thread+0xd5e/0x1260
> worker_thread+0x91e/0x1260
> kthread+0x25d/0x2f0
> ? __cfi_worker_thread+0x10/0x10
> ? __cfi_kthread+0x10/0x10
> ret_from_fork+0x41/0x70
> ? __cfi_kthread+0x10/0x10
> ret_from_fork_asm+0x1b/0x30
> RIP: 0000:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
>