vector: spinlock ordering problems

Anton Ivanov anton.ivanov at cambridgegreys.com
Wed Jul 3 22:23:22 PDT 2024


On 03/07/2024 17:26, Johannes Berg wrote:
> Lockdep reports a spinlock ordering problem: sometimes we take head_lock
> first, sometimes tail_lock, so there's a classic ABBA deadlock possible.
> It may not happen now because of UML being single-CPU and all that, but
> perhaps with preempt?

You need both locks only when you updating the queue depth. That was not 
an issue without preempt. Looking at the trace this is now an issue, so 
it will need to be sorted out.

The way the locks work is - head_lock guards head, tail_lock guards tail 
and head/tail run around and wrap.

If increment and decrement of the queue_depth is using atomics you do 
not need to grab both locks. One lock + atomic sub/add will suffice.

> 
> Report:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.10.0-rc6-00036-gcd01672d64a3 #26 Not tainted
> ------------------------------------------------------
> systemd-network/105 is trying to acquire lock:
> 00000000622bd8b0 (&result->tail_lock){+...}-{2:2}, at: vector_send+0x1c5/0x266
> 
> but task is already holding lock:
> 00000000622bd870 (&result->head_lock){+...}-{2:2}, at: vector_send+0x3b/0x266
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&result->head_lock){+...}-{2:2}:
>         save_stack_trace+0x32/0x34
>         stack_trace_save+0x38/0x3d
>         save_trace+0x90/0x268
>         __lock_acquire+0x146c/0x15a4
>         lock_acquire+0x2e1/0x38d
>         _raw_spin_lock+0x43/0x5a
>         vector_net_start_xmit+0x1ef/0x3bd
>         netdev_start_xmit+0x51/0x80
>         dev_hard_start_xmit+0x137/0x22f
>         sch_direct_xmit+0xc6/0x2a3
>         __dev_queue_xmit+0x3da/0x877
>         packet_xmit+0x3c/0xf2
>         packet_sendmsg+0x106a/0x10d0
>         sock_sendmsg_nosec+0x1c/0x98
>         __sys_sendto+0x113/0x147
>         sys_sendto+0x14/0x18
>         handle_syscall+0xa8/0xd8
>         userspace+0x323/0x4e5
>         fork_handler+0x8c/0x8e
> 
> -> #0 (&result->tail_lock){+...}-{2:2}:
>         save_stack_trace+0x32/0x34
>         stack_trace_save+0x38/0x3d
>         save_trace+0x90/0x268
>         print_circular_bug+0x68/0x2b5
>         check_noncircular+0x11c/0x12c
>         __lock_acquire+0x12a4/0x15a4
>         lock_acquire+0x2e1/0x38d
>         _raw_spin_lock+0x43/0x5a
>         vector_send+0x1c5/0x266
>         vector_net_start_xmit+0x394/0x3bd
>         netdev_start_xmit+0x51/0x80
>         dev_hard_start_xmit+0x137/0x22f
>         sch_direct_xmit+0xc6/0x2a3
>         __dev_queue_xmit+0x3da/0x877
>         packet_xmit+0x3c/0xf2
>         packet_sendmsg+0x106a/0x10d0
>         sock_sendmsg_nosec+0x1c/0x98
>         __sys_sendto+0x113/0x147
>         sys_sendto+0x14/0x18
>         handle_syscall+0xa8/0xd8
>         userspace+0x323/0x4e5
>         fork_handler+0x8c/0x8e
> 
> other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(&result->head_lock);
>                                 lock(&result->tail_lock);
>                                 lock(&result->head_lock);
>    lock(&result->tail_lock);
> 
>   *** DEADLOCK ***
> 
> 4 locks held by systemd-network/105:
>   #0: 0000000060749920 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire+0x16/0x30
>   #1: 0000000066a47210 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin+0x2e/0x66
>   #2: 000000006227f690 (_xmit_ETHER#2){+...}-{2:2}, at: sch_direct_xmit+0x9b/0x2a3
>   #3: 00000000622bd870 (&result->head_lock){+...}-{2:2}, at: vector_send+0x3b/0x266
> 
> stack backtrace:
> CPU: 0 PID: 105 Comm: systemd-network Not tainted 6.10.0-rc6-00036-gcd01672d64a3 #26
> Stack:
>   812c7710 604c0e3b 000000ea 00000000
>   ffffff00 6064f6f1 604de3a0 6063beda
>   812c7740 604e8b7c 000000ec 604df56f
> Call Trace:
>   [<604df56f>] ? _printk+0x0/0x98
>   [<6002b386>] show_stack+0x141/0x150
>   [<604c0e3b>] ? dump_stack_print_info+0xe2/0xf1
>   [<604de3a0>] ? __print_lock_name+0x0/0xbd
>   [<604e8b7c>] dump_stack_lvl+0x71/0xb8
>   [<604df56f>] ? _printk+0x0/0x98
>   [<604e8be1>] dump_stack+0x1e/0x20
>   [<6009348a>] print_circular_bug+0x2a6/0x2b5
>   [<600c8e7d>] ? stack_trace_save+0x38/0x3d
>   [<60090fbf>] ? save_trace+0x90/0x268
>   [<600935b5>] check_noncircular+0x11c/0x12c
>   [<600acb42>] ? rcu_read_lock_any_held+0x40/0x9c
>   [<604ea4ec>] ? __this_cpu_preempt_check+0x14/0x16
>   [<60091cc6>] ? lookup_chain_cache+0x4e/0x130
>   [<60090d22>] ? hlock_class+0x0/0xa3
>   [<600963be>] __lock_acquire+0x12a4/0x15a4
>   [<604ea4d8>] ? __this_cpu_preempt_check+0x0/0x16
>   [<60095034>] lock_acquire+0x2e1/0x38d
>   [<60034e49>] ? vector_send+0x1c5/0x266
>   [<604e9cf7>] ? debug_lockdep_rcu_enabled+0x0/0x3f
>   [<604e9cf7>] ? debug_lockdep_rcu_enabled+0x0/0x3f
>   [<604f2532>] _raw_spin_lock+0x43/0x5a
>   [<60034e49>] ? vector_send+0x1c5/0x266
>   [<604f24ef>] ? _raw_spin_lock+0x0/0x5a
>   [<60034e49>] vector_send+0x1c5/0x266
>   [<604f28bb>] ? _raw_spin_unlock+0x0/0x7c
>   [<6003538c>] vector_net_start_xmit+0x394/0x3bd
>   [<604ea4d8>] ? __this_cpu_preempt_check+0x0/0x16
>   [<603d0711>] netdev_start_xmit+0x51/0x80
>   [<600b09cd>] ? rcu_is_watching+0x0/0x6f
>   [<603d79f9>] dev_hard_start_xmit+0x137/0x22f
>   [<6041a35b>] sch_direct_xmit+0xc6/0x2a3
>   [<603d15a7>] ? qdisc_run_end+0x0/0x52
>   [<603d7f6b>] __dev_queue_xmit+0x3da/0x877
>   [<603b3f55>] ? skb_set_owner_w+0x96/0x9b
>   [<604ba1b5>] packet_xmit+0x3c/0xf2
>   [<604bfbc2>] packet_sendmsg+0x106a/0x10d0
>   [<6002e30a>] ? copy_chunk_from_user+0x26/0x30
>   [<6002e2e4>] ? copy_chunk_from_user+0x0/0x30
>   [<6002e592>] ? do_op_one_page+0x9d/0xa9
>   [<60150008>] ? __gup_longterm_locked+0x39b/0x618
>   [<603af0f0>] sock_sendmsg_nosec+0x1c/0x98
>   [<603b13ca>] __sys_sendto+0x113/0x147
>   [<600f8264>] ? __seccomp_filter+0xa9/0x3c7
>   [<6004682f>] ? switch_threads+0x28/0x57
>   [<603b1412>] sys_sendto+0x14/0x18
>   [<6002e252>] handle_syscall+0xa8/0xd8
>   [<60046631>] userspace+0x323/0x4e5
>   [<6002a494>] ? interrupt_end+0x0/0xe4
>   [<6002a3c3>] fork_handler+0x8c/0x8e
> 
> 
> 

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/




More information about the linux-um mailing list