BUG: Circular locking dependency on netdev led trigger on NanoPi R5S
Diederik de Haas
didi.debian at cknow.org
Fri Jul 25 10:48:03 PDT 2025
Hi,
I have a FriendlyELEC NanoPi R5S (with rk3568 SoC) and in commit
1631cbdb8089 ("arm64: dts: rockchip: Improve LED config for NanoPi R5S")
I tried to improve its LED configuration and that included
``linux,default-trigger = "netdev"``
Problem: sometimes I got a 'hung task' error which resulted in the WAN
port not coming up (that's the only one I use) and logging in via
serial also didn't work, so pulling the plug was the only remedy.
Robin Murphy quickly identified that it likely had to do with led
triggers and removing those netdev triggers made the problem go away[1].
To find out what actually caused it, I built a kernel with PROVE_LOCKING
and PRINTK_CALLER enabled, which, after adding a patch that fixed an
OOPS [2], showed the underlying problem:
======================================================
WARNING: possible circular locking dependency detected
6.16-rc7+unreleased-arm64-cknow #1 Not tainted
------------------------------------------------------
modprobe/936 is trying to acquire lock:
ffffc943e0edc3b0 (pernet_ops_rwsem){++++}-{4:4}, at: register_netdevice_notifier+0x38/0x148
but task is already holding lock:
ffff0001f2762248 (&led_cdev->trigger_lock){+.+.}-{4:4}, at: led_trigger_register+0x14c/0x1e0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&led_cdev->trigger_lock){+.+.}-{4:4}:
lock_acquire+0x1cc/0x348
down_write+0x40/0xd8
led_trigger_set_default+0x5c/0x170
led_classdev_register_ext+0x340/0x488
__sdhci_add_host+0x190/0x368 [sdhci]
dwcmshc_probe+0x2b8/0x6b0 [sdhci_of_dwcmshc]
platform_probe+0x70/0xe8
really_probe+0xc8/0x3a0
__driver_probe_device+0x84/0x160
driver_probe_device+0x44/0x128
__device_attach_driver+0xc4/0x170
bus_for_each_drv+0x90/0xf8
__device_attach_async_helper+0xc0/0x120
async_run_entry_fn+0x40/0x180
process_one_work+0x23c/0x640
worker_thread+0x1b4/0x360
kthread+0x150/0x250
ret_from_fork+0x10/0x20
-> #2 (triggers_list_lock){++++}-{4:4}:
lock_acquire+0x1cc/0x348
down_write+0x40/0xd8
led_trigger_register+0x58/0x1e0
phy_led_triggers_register+0xf4/0x258 [libphy]
phy_attach_direct+0x328/0x3a8 [libphy]
phylink_fwnode_phy_connect+0xb0/0x138 [phylink]
__stmmac_open+0xec/0x520 [stmmac]
stmmac_open+0x4c/0xe8 [stmmac]
__dev_open+0x13c/0x310
__dev_change_flags+0x1d4/0x260
netif_change_flags+0x2c/0x80
dev_change_flags+0x90/0xd0
devinet_ioctl+0x55c/0x730
inet_ioctl+0x1e4/0x200
sock_do_ioctl+0x6c/0x140
sock_ioctl+0x328/0x3c0
__arm64_sys_ioctl+0xb4/0x118
invoke_syscall+0x6c/0x100
el0_svc_common.constprop.0+0x48/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x54/0x1e0
el0t_64_sync_handler+0x10c/0x140
el0t_64_sync+0x198/0x1a0
-> #1 (rtnl_mutex){+.+.}-{4:4}:
lock_acquire+0x1cc/0x348
__mutex_lock+0xac/0x590
mutex_lock_nested+0x2c/0x40
rtnl_lock+0x24/0x38
register_netdevice_notifier+0x40/0x148
rtnetlink_init+0x40/0x68
netlink_proto_init+0x120/0x158
do_one_initcall+0x88/0x3b8
kernel_init_freeable+0x2d0/0x340
kernel_init+0x28/0x160
ret_from_fork+0x10/0x20
-> #0 (pernet_ops_rwsem){++++}-{4:4}:
check_prev_add+0x114/0xcb8
__lock_acquire+0x12e8/0x15f0
lock_acquire+0x1cc/0x348
down_write+0x40/0xd8
register_netdevice_notifier+0x38/0x148
netdev_trig_activate+0x18c/0x1e8 [ledtrig_netdev]
led_trigger_set+0x1d4/0x328
led_trigger_register+0x194/0x1e0
netdev_led_trigger_init+0x20/0xff8 [ledtrig_netdev]
do_one_initcall+0x88/0x3b8
do_init_module+0x5c/0x270
load_module+0x1ed8/0x2608
init_module_from_file+0x94/0x100
idempotent_init_module+0x1e8/0x2f0
__arm64_sys_finit_module+0x70/0xe8
invoke_syscall+0x6c/0x100
el0_svc_common.constprop.0+0x48/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x54/0x1e0
el0t_64_sync_handler+0x10c/0x140
el0t_64_sync+0x198/0x1a0
other info that might help us debug this:
Chain exists of:
pernet_ops_rwsem --> triggers_list_lock --> &led_cdev->trigger_lock
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&led_cdev->trigger_lock);
                               lock(triggers_list_lock);
                               lock(&led_cdev->trigger_lock);
  lock(pernet_ops_rwsem);
*** DEADLOCK ***
2 locks held by modprobe/936:
#0: ffffc943e0d2baa8 (leds_list_lock){++++}-{4:4}, at: led_trigger_register+0x10c/0x1e0
#1: ffff0001f2762248 (&led_cdev->trigger_lock){+.+.}-{4:4}, at: led_trigger_register+0x14c/0x1e0
stack backtrace:
CPU: 0 UID: 0 PID: 936 Comm: modprobe Not tainted 6.16-rc7+unreleased-arm64-cknow #1 PREEMPTLAZY Debian 6.16~rc7-2~exp1
Hardware name: FriendlyElec NanoPi R5S (DT)
Call trace:
show_stack+0x34/0xa0 (C)
dump_stack_lvl+0x70/0x98
dump_stack+0x18/0x24
print_circular_bug+0x230/0x280
check_noncircular+0x174/0x188
check_prev_add+0x114/0xcb8
__lock_acquire+0x12e8/0x15f0
lock_acquire+0x1cc/0x348
down_write+0x40/0xd8
register_netdevice_notifier+0x38/0x148
netdev_trig_activate+0x18c/0x1e8 [ledtrig_netdev]
led_trigger_set+0x1d4/0x328
led_trigger_register+0x194/0x1e0
netdev_led_trigger_init+0x20/0xff8 [ledtrig_netdev]
do_one_initcall+0x88/0x3b8
do_init_module+0x5c/0x270
load_module+0x1ed8/0x2608
init_module_from_file+0x94/0x100
idempotent_init_module+0x1e8/0x2f0
__arm64_sys_finit_module+0x70/0xe8
invoke_syscall+0x6c/0x100
el0_svc_common.constprop.0+0x48/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x54/0x1e0
el0t_64_sync_handler+0x10c/0x140
el0t_64_sync+0x198/0x1a0
leds-gpio gpio-leds: bus: 'platform': really_probe: bound device to driver leds-gpio
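As a reading aid, the dependency chain in the report above can be modelled
as a small directed graph in which an edge "A -> B" means "B was acquired
while A was held". The sketch below is only an illustration in Python, not
how lockdep itself is implemented. The edges are an interpretation of
entries #0 to #3 of the trace, and the names EDGES and find_cycle are
made up for this example.

```python
from typing import Dict, List, Optional

# Lock-order edges "held -> acquired next", as read from the four entries of
# the report. This is an interpretation of the stack traces (an assumption),
# not data exported by lockdep.
EDGES: Dict[str, List[str]] = {
    # entry 1: register_netdevice_notifier takes pernet_ops_rwsem, then rtnl
    "pernet_ops_rwsem": ["rtnl_mutex"],
    # entry 2: device open runs under rtnl and reaches led_trigger_register
    "rtnl_mutex": ["triggers_list_lock"],
    # entry 3: led_trigger_set_default takes the per-LED trigger_lock
    "triggers_list_lock": ["led_cdev->trigger_lock"],
    # entry 0: netdev_trig_activate registers a netdevice notifier
    "led_cdev->trigger_lock": ["pernet_ops_rwsem"],
}


def find_cycle(edges: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of locks (first == last), or None."""
    path: List[str] = []  # locks on the current depth-first search path
    done = set()          # locks fully explored without finding a cycle

    def dfs(node: str) -> Optional[List[str]]:
        if node in path:
            # Reached a lock that is already on the path: cycle found.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in edges.get(node, []):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in edges:
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(EDGES)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Removing any single edge from this graph makes find_cycle return None.
The same holds for the real locks: breaking one of these orderings is
enough to remove the circular dependency.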
The full serial log can be found at [3]. It is quite verbose and the boot
took way longer than normal as the following was added to cmdline:
``dyndbg="file dd.c func really_probe +p" maxcpus=1``
Feel free to ask for additional info and/or to run tests.
Cheers,
Diederik
[1] https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/commit/?h=arm/fixes&id=912b1f2a796ec73530a709b11821cb0c249fb23e
[2] https://lore.kernel.org/linux-rockchip/f81b88df-9959-4968-a60a-b7efd3d5ea24@arm.com/
[3] https://paste.sr.ht/~diederik/142e92bfb29bbb58bca18a74cdffc5e0ba79081c