[BUG] v4.11-rc1: CPUFREQ Circular locking dependency
Rafael J. Wysocki
rafael at kernel.org
Fri Mar 10 09:42:32 PST 2017
On Fri, Mar 10, 2017 at 4:02 PM, Russell King - ARM Linux
<linux at armlinux.org.uk> wrote:
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 4.11.0-rc1+ #2121 Not tainted
> -------------------------------------------------------
> ondemand/1005 is trying to acquire lock:
> (cooling_list_lock){+.+...}, at: [<c052d074>] cpufreq_thermal_notifier+0x2c/0xcc
> but task is already holding lock:
> ((cpufreq_policy_notifier_list).rwsem){++++..}, at: [<c0058ff8>] __blocking_notifier_call_chain+0x34/0x68
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
> -> #1 ((cpufreq_policy_notifier_list).rwsem){++++..}:
> down_write+0x44/0x98
> blocking_notifier_chain_register+0x28/0xd8
> cpufreq_register_notifier+0xa4/0xe4
> __cpufreq_cooling_register+0x4cc/0x578
> cpufreq_cooling_register+0x20/0x24
> imx_thermal_probe+0x1c4/0x5f4 [imx_thermal]
> platform_drv_probe+0x58/0xb8
> driver_probe_device+0x204/0x2c8
> __driver_attach+0xbc/0xc0
> bus_for_each_dev+0x5c/0x90
> driver_attach+0x24/0x28
> bus_add_driver+0xf4/0x200
> driver_register+0x80/0xfc
> __platform_driver_register+0x48/0x4c
> 0xbf04d018
> do_one_initcall+0x44/0x170
> do_init_module+0x68/0x1d8
> load_module+0x1968/0x208c
> SyS_finit_module+0x94/0xa0
> ret_fast_syscall+0x0/0x1c
> -> #0 (cooling_list_lock){+.+...}:
> lock_acquire+0xd8/0x250
> __mutex_lock+0x58/0x930
> mutex_lock_nested+0x24/0x2c
> cpufreq_thermal_notifier+0x2c/0xcc
> notifier_call_chain+0x4c/0x8c
> __blocking_notifier_call_chain+0x50/0x68
> blocking_notifier_call_chain+0x20/0x28
> cpufreq_set_policy+0x74/0x1a4
> store_scaling_governor+0x68/0x84
> store+0x70/0x94
> sysfs_kf_write+0x54/0x58
> kernfs_fop_write+0x138/0x204
> __vfs_write+0x34/0x11c
> vfs_write+0xac/0x16c
> SyS_write+0x44/0x90
> ret_fast_syscall+0x0/0x1c
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock((cpufreq_policy_notifier_list).rwsem);
>                                lock(cooling_list_lock);
>                                lock((cpufreq_policy_notifier_list).rwsem);
>   lock(cooling_list_lock);
>
> *** DEADLOCK ***
This broke it:
commit ae606089621ef0349402cfcbeca33a82abbd0fd0
Author: Matthew Wilcox <mawilcox at microsoft.com>
Date:   Wed Dec 21 09:47:05 2016 -0800

    thermal: convert cpu_cooling to use an IDA

    thermal cpu cooling does not use the ability to look up pointers by ID,
    so convert it from using an IDR to the more space-efficient IDA.

    The cooling_cpufreq_lock was being used to protect cpufreq_dev_count as
    well as the IDR. Rather than keep the mutex to protect a single integer,
    I expanded the scope of cooling_list_lock to also cover cpufreq_dev_count.
    We could also convert cpufreq_dev_count into an atomic.

    Signed-off-by: Matthew Wilcox <mawilcox at microsoft.com>
    Signed-off-by: Zhang Rui <rui.zhang at intel.com>
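
The alternative mentioned at the end of that changelog, turning
cpufreq_dev_count into an atomic, might look roughly like the userspace
C11 sketch below. It is an illustration only, not the kernel code and not
necessarily the right fix: dev_count, dev_count_get() and dev_count_put()
are hypothetical names, and <stdatomic.h> stands in for the kernel's
atomic_t helpers. The point is that deciding who is the first or last
user would no longer need cooling_list_lock, so the notifier would not
have to be registered with that mutex held.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for cpufreq_dev_count. */
static atomic_int dev_count;

/*
 * Take a reference. Returns true if the caller is the first user,
 * i.e. the one that would go on to register the policy notifier.
 */
static bool dev_count_get(void)
{
	/* atomic_fetch_add() returns the value held before the add. */
	return atomic_fetch_add(&dev_count, 1) == 0;
}

/*
 * Drop a reference. Returns true if the caller was the last user,
 * i.e. the one that would go on to unregister the policy notifier.
 */
static bool dev_count_put(void)
{
	/* atomic_fetch_sub() returns the value held before the subtract. */
	return atomic_fetch_sub(&dev_count, 1) == 1;
}
```

A counter on its own does not serialize a registration against a
concurrent unregistration, so something would probably still have to
order those two paths; it just should not be a lock that the notifier
callback takes as well.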
Matthew? Rui?
Thanks,
Rafael
> 6 locks held by ondemand/1005:
> #0: (sb_writers#6){.+.+.+}, at: [<c017cf38>] vfs_write+0x150/0x16c
> #1: (&of->mutex){+.+.+.}, at: [<c01fbd7c>] kernfs_fop_write+0xf8/0x204
> #2: (s_active#135){.+.+.+}, at: [<c01fbd84>] kernfs_fop_write+0x100/0x204
> #3: (cpu_hotplug.dep_map){++++++}, at: [<c0034028>] get_online_cpus+0x34/0xa8
> #4: (&policy->rwsem){+++++.}, at: [<c052edd8>] store+0x5c/0x94
> #5: ((cpufreq_policy_notifier_list).rwsem){++++..}, at: [<c0058ff8>] __blocking_notifier_call_chain+0x34/0x68
>
> stack backtrace:
> CPU: 1 PID: 1005 Comm: ondemand Not tainted 4.11.0-rc1+ #2121
> Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> Backtrace:
> [<c0013ba4>] (dump_backtrace) from [<c0013de4>] (show_stack+0x18/0x1c)
> r6:600e0093 r5:ffffffff r4:00000000 r3:00000000
> [<c0013dcc>] (show_stack) from [<c033ea48>] (dump_stack+0xa4/0xdc)
> [<c033e9a4>] (dump_stack) from [<c011db2c>] (print_circular_bug+0x28c/0x2e0)
> r6:c0bf2d84 r5:c0bf2e94 r4:c0bf2d84 r3:c09e84a8
> [<c011d8a0>] (print_circular_bug) from [<c008b08c>] (__lock_acquire+0x16c8/0x17b0)
> r10:ee75aa48 r8:00000006 r7:c0a531c8 r6:ee75aa28 r5:ee75a4c0 r4:c141aa58
> [<c00899c4>] (__lock_acquire) from [<c008b6d8>] (lock_acquire+0xd8/0x250)
> r10:00000000 r9:c0a8a8a4 r8:00000000 r7:00000000 r6:c0a74fe4 r5:600e0013
> r4:00000000
> [<c008b600>] (lock_acquire) from [<c070ce18>] (__mutex_lock+0x58/0x930)
> r10:00000002 r9:00000000 r8:c141aa58 r7:edc8bca0 r6:00000000 r5:00000000
> r4:c0a74fb0
> [<c070cdc0>] (__mutex_lock) from [<c070d798>] (mutex_lock_nested+0x24/0x2c)
> r10:00000008 r9:00000000 r8:00000000 r7:edc8bca0 r6:00000000 r5:c0a74fb0
> r4:edc8bca0
> [<c070d774>] (mutex_lock_nested) from [<c052d074>] (cpufreq_thermal_notifier+0x2c/0xcc)
> [<c052d048>] (cpufreq_thermal_notifier) from [<c0058c54>] (notifier_call_chain+0x4c/0x8c)
> r5:00000000 r4:ffffffff
> [<c0058c08>] (notifier_call_chain) from [<c0059014>] (__blocking_notifier_call_chain+0x50/0x68)
> r8:d014e400 r7:00000000 r6:edc8bca0 r5:ffffffff r4:c0a7521c r3:ffffffff
> [<c0058fc4>] (__blocking_notifier_call_chain) from [<c005904c>] (blocking_notifier_call_chain+0x20/0x28)
> r7:c1437a4c r6:00000000 r5:d014e400 r4:edc8bca0
> [<c005902c>] (blocking_notifier_call_chain) from [<c05318d0>] (cpufreq_set_policy+0x74/0x1a4)
> [<c053185c>] (cpufreq_set_policy) from [<c0531a68>] (store_scaling_governor+0x68/0x84)
> r8:d014e400 r7:c0a75410 r6:00000008 r5:d8f83480 r4:d014e400 r3:00000000
> [<c0531a00>] (store_scaling_governor) from [<c052edec>] (store+0x70/0x94)
> r6:d8f83480 r5:00000008 r4:d014e4e0
> [<c052ed7c>] (store) from [<c01fcc54>] (sysfs_kf_write+0x54/0x58)
> r8:00000000 r7:d8f83480 r6:d8f83480 r5:00000008 r4:d01c1240 r3:00000008
> [<c01fcc00>] (sysfs_kf_write) from [<c01fbdbc>] (kernfs_fop_write+0x138/0x204)
> r6:d01c1240 r5:d01c1250 r4:00000000 r3:ee75a4c0
> [<c01fbc84>] (kernfs_fop_write) from [<c017b6f0>] (__vfs_write+0x34/0x11c)
> r10:809f5d08 r9:edc8a000 r8:00000008 r7:edc8bf78 r6:d5af4b40 r5:809f5d08
> r4:c071eebc
> [<c017b6bc>] (__vfs_write) from [<c017ce94>] (vfs_write+0xac/0x16c)
> r8:edc8bf78 r7:00000000 r6:809f5d08 r5:00000008 r4:d5af4b40
> [<c017cde8>] (vfs_write) from [<c017d144>] (SyS_write+0x44/0x90)
> r10:809f5d08 r8:00000008 r7:d5af4b40 r6:d5af4b40 r5:00000000 r4:00000000
> [<c017d100>] (SyS_write) from [<c000fd60>] (ret_fast_syscall+0x0/0x1c)
> r10:00000000 r8:c000ff04 r7:00000004 r6:7f8a8d08 r5:809f5d08 r4:00000008
>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.