[PATCH 2/2] nvmet-tcp: Fix incorrect locking in state_change sk callback
Yi Zhang
yi.zhang at redhat.com
Wed Mar 24 02:06:03 GMT 2021
Hi Sagi,

With the two patches applied, I reproduced another lock dependency issue; here is
the full log:
[ 143.310362] run blktests nvme/003 at 2021-03-23 21:52:15
[ 143.927284] loop: module loaded
[ 144.027532] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 144.059070] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[ 144.201559] nvmet: creating controller 1 for subsystem
nqn.2014-08.org.nvmexpress.discovery for NQN
nqn.2014-08.org.nvmexpress:uuid:e25db33098f14032b70b755db1976647.
[ 144.211644] nvme nvme1: new ctrl: NQN
"nqn.2014-08.org.nvmexpress.discovery", addr 127.0.0.1:4420
[ 154.400575] nvme nvme1: Removing ctrl: NQN
"nqn.2014-08.org.nvmexpress.discovery"
[ 154.407970] ======================================================
[ 154.414871] WARNING: possible circular locking dependency detected
[ 154.421765] 5.12.0-rc3.fix+ #2 Not tainted
[ 154.426340] ------------------------------------------------------
[ 154.433232] kworker/7:2/260 is trying to acquire lock:
[ 154.438972] ffff888288e92030
((work_completion)(&queue->io_work)){+.+.}-{0:0}, at:
__flush_work+0x118/0x1a0
[ 154.449882]
but task is already holding lock:
[ 154.456395] ffffc90002b57db0
((work_completion)(&queue->release_work)){+.+.}-{0:0}, at:
process_one_work+0x7c1/0x1480
[ 154.468263]
which lock already depends on the new lock.
[ 154.477393]
the existing dependency chain (in reverse order) is:
[ 154.485739]
-> #2 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
[ 154.494884] __lock_acquire+0xb77/0x18d0
[ 154.499853] lock_acquire+0x1ca/0x480
[ 154.504528] process_one_work+0x813/0x1480
[ 154.509688] worker_thread+0x590/0xf80
[ 154.514458] kthread+0x368/0x440
[ 154.518650] ret_from_fork+0x22/0x30
[ 154.523232]
-> #1 ((wq_completion)events){+.+.}-{0:0}:
[ 154.530633] __lock_acquire+0xb77/0x18d0
[ 154.535597] lock_acquire+0x1ca/0x480
[ 154.540272] flush_workqueue+0x101/0x1250
[ 154.545334] nvmet_tcp_install_queue+0x22c/0x2a0 [nvmet_tcp]
[ 154.552242] nvmet_install_queue+0x2a3/0x360 [nvmet]
[ 154.558387] nvmet_execute_admin_connect+0x321/0x420 [nvmet]
[ 154.565305] nvmet_tcp_io_work+0xa04/0xcfb [nvmet_tcp]
[ 154.571629] process_one_work+0x8b2/0x1480
[ 154.576787] worker_thread+0x590/0xf80
[ 154.581560] kthread+0x368/0x440
[ 154.585749] ret_from_fork+0x22/0x30
[ 154.590328]
-> #0 ((work_completion)(&queue->io_work)){+.+.}-{0:0}:
[ 154.598989] check_prev_add+0x15e/0x20f0
[ 154.603953] validate_chain+0xec9/0x19c0
[ 154.608918] __lock_acquire+0xb77/0x18d0
[ 154.613883] lock_acquire+0x1ca/0x480
[ 154.618556] __flush_work+0x139/0x1a0
[ 154.623229] nvmet_tcp_release_queue_work+0x2e5/0xcb0 [nvmet_tcp]
[ 154.630621] process_one_work+0x8b2/0x1480
[ 154.635780] worker_thread+0x590/0xf80
[ 154.640549] kthread+0x368/0x440
[ 154.644741] ret_from_fork+0x22/0x30
[ 154.649321]
other info that might help us debug this:
[ 154.658257] Chain exists of:
(work_completion)(&queue->io_work) -->
(wq_completion)events --> (work_completion)(&queue->release_work)
[ 154.675070] Possible unsafe locking scenario:
[ 154.681679]        CPU0                    CPU1
[ 154.686728]        ----                    ----
[ 154.691776]   lock((work_completion)(&queue->release_work));
[ 154.698102]                                lock((wq_completion)events);
[ 154.705493]                                lock((work_completion)(&queue->release_work));
[ 154.714631]   lock((work_completion)(&queue->io_work));
[ 154.720470]
*** DEADLOCK ***
[ 154.727080] 2 locks held by kworker/7:2/260:
[ 154.731849] #0: ffff888100053148
((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x792/0x1480
[ 154.742458] #1: ffffc90002b57db0
((work_completion)(&queue->release_work)){+.+.}-{0:0}, at:
process_one_work+0x7c1/0x1480
[ 154.754809]
stack backtrace:
[ 154.759674] CPU: 7 PID: 260 Comm: kworker/7:2 Not tainted
5.12.0-rc3.fix+ #2
[ 154.767549] Hardware name: Dell Inc. PowerEdge R730xd, BIOS 2.12.1 12/04/2020
[ 154.776197] Workqueue: events nvmet_tcp_release_queue_work [nvmet_tcp]
[ 154.783497] Call Trace:
[ 154.786231] dump_stack+0x93/0xc2
[ 154.789942] check_noncircular+0x26a/0x310
[ 154.794521] ? print_circular_bug+0x460/0x460
[ 154.799391] ? deref_stack_reg+0x170/0x170
[ 154.803967] ? alloc_chain_hlocks+0x1de/0x520
[ 154.808843] check_prev_add+0x15e/0x20f0
[ 154.813231] validate_chain+0xec9/0x19c0
[ 154.817611] ? check_prev_add+0x20f0/0x20f0
[ 154.822286] ? save_trace+0x88/0x5e0
[ 154.826290] __lock_acquire+0xb77/0x18d0
[ 154.830682] lock_acquire+0x1ca/0x480
[ 154.834775] ? __flush_work+0x118/0x1a0
[ 154.839066] ? rcu_read_unlock+0x40/0x40
[ 154.843455] ? __lock_acquire+0xb77/0x18d0
[ 154.848036] __flush_work+0x139/0x1a0
[ 154.852120] ? __flush_work+0x118/0x1a0
[ 154.856409] ? start_flush_work+0x810/0x810
[ 154.861084] ? mark_lock+0xd3/0x1470
[ 154.865082] ? mark_lock_irq+0x1d10/0x1d10
[ 154.869662] ? lock_downgrade+0x100/0x100
[ 154.874147] ? mark_held_locks+0xa5/0xe0
[ 154.878522] ? sk_stream_wait_memory+0xe40/0xe40
[ 154.883686] ? lockdep_hardirqs_on_prepare.part.0+0x198/0x340
[ 154.890394] ? __local_bh_enable_ip+0xa2/0x100
[ 154.895358] ? trace_hardirqs_on+0x1c/0x160
[ 154.900034] ? sk_stream_wait_memory+0xe40/0xe40
[ 154.905192] nvmet_tcp_release_queue_work+0x2e5/0xcb0 [nvmet_tcp]
[ 154.911999] ? lock_is_held_type+0x9a/0x110
[ 154.916676] process_one_work+0x8b2/0x1480
[ 154.921255] ? pwq_dec_nr_in_flight+0x260/0x260
[ 154.926315] ? __lock_contended+0x910/0x910
[ 154.930990] ? worker_thread+0x150/0xf80
[ 154.935374] worker_thread+0x590/0xf80
[ 154.939564] ? __kthread_parkme+0xcb/0x1b0
[ 154.944140] ? process_one_work+0x1480/0x1480
[ 154.949007] kthread+0x368/0x440
[ 154.952615] ? _raw_spin_unlock_irq+0x24/0x30
[ 154.957482] ? __kthread_bind_mask+0x90/0x90
[ 154.962255] ret_from_fork+0x22/0x30
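If it helps to see the shape of the new chain: per the #0/#1/#2 entries above,
release_work runs on the system "events" workqueue and synchronously waits for
io_work, while io_work (via nvmet_tcp_install_queue -> flush_workqueue in the
trace) waits on that same "events" workqueue, which may still hold release_work.
Below is a minimal, self-contained sketch of that pattern only; the work item
names mirror the trace, but the module and struct names are hypothetical and
this is not the nvmet-tcp code itself.

/*
 * Illustrative only: a stripped-down reproduction of the flush cycle
 * reported above.
 */
#include <linux/module.h>
#include <linux/workqueue.h>

struct demo_queue {
	struct work_struct io_work;	 /* stands in for queue->io_work */
	struct work_struct release_work; /* stands in for queue->release_work */
};

static struct demo_queue q;
static struct workqueue_struct *demo_io_wq;

static void demo_io_work(struct work_struct *w)
{
	/*
	 * Mirrors the nvmet_tcp_io_work -> nvmet_tcp_install_queue ->
	 * flush_workqueue frames in the trace: wait for everything on the
	 * system "events" workqueue, which may include release_work.
	 */
	flush_scheduled_work();
}

static void demo_release_work(struct work_struct *w)
{
	/*
	 * Mirrors nvmet_tcp_release_queue_work: runs *on* the system
	 * "events" workqueue and synchronously waits for io_work.
	 * lockdep now sees io_work -> events (above) and
	 * events -> release_work -> io_work (here): a circle.
	 */
	flush_work(&q.io_work);
}

static int __init demo_init(void)
{
	demo_io_wq = alloc_workqueue("demo_io_wq", 0, 0);
	if (!demo_io_wq)
		return -ENOMEM;

	INIT_WORK(&q.io_work, demo_io_work);
	INIT_WORK(&q.release_work, demo_release_work);

	queue_work(demo_io_wq, &q.io_work);	/* io_work on its own wq */
	schedule_work(&q.release_work);		/* release_work on "events" */
	return 0;
}

static void __exit demo_exit(void)
{
	cancel_work_sync(&q.release_work);
	cancel_work_sync(&q.io_work);
	destroy_workqueue(demo_io_wq);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");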
On 3/21/21 3:08 PM, Sagi Grimberg wrote:
> We are not changing anything in the TCP connection state so
> we should not take a write_lock but rather a read lock.
>
> This caused a deadlock when running nvmet-tcp and nvme-tcp
> on the same system, where state_change callbacks on the
> host and on the controller side have a causal relationship
> and made lockdep report on this with blktests:
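For reference, the change described in the quoted message boils down to taking
the read side of sk_callback_lock in the state_change callback, since the
callback only reads queue state. The following is a schematic of that shape,
not the literal patch hunks; the nvmet_tcp_queue type and the
nvmet_tcp_schedule_release_queue() helper are assumed from
drivers/nvme/target/tcp.c.

/* Schematic only -- not the literal patch. The callback never modifies
 * TCP connection state, so the read side of sk_callback_lock is enough,
 * which is what breaks the host/controller lock dependency lockdep
 * complained about in the original report.
 */
static void nvmet_tcp_state_change(struct sock *sk)
{
	struct nvmet_tcp_queue *queue;

	read_lock_bh(&sk->sk_callback_lock);	/* was write_lock_bh() */
	queue = sk->sk_user_data;
	if (!queue)
		goto done;

	switch (sk->sk_state) {
	case TCP_FIN_WAIT1:
	case TCP_CLOSE_WAIT:
	case TCP_CLOSE:
		/* FALLTHRU */
		nvmet_tcp_schedule_release_queue(queue);
		break;
	default:
		pr_warn("queue %d unhandled state %d\n",
			queue->idx, sk->sk_state);
	}
done:
	read_unlock_bh(&sk->sk_callback_lock);	/* was write_unlock_bh() */
}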