3.18: lockdep problems in cpufreq
Yadwinder Singh Brar
yadi.brar at samsung.com
Mon Dec 15 05:28:41 PST 2014
> -----Original Message-----
> From: Viresh Kumar [mailto:viresh.kumar at linaro.org]
> Sent: Monday, December 15, 2014 9:27 AM
> To: Rafael J. Wysocki; yadi.brar at samsung.com
> Cc: Russell King - ARM Linux; linux-arm-kernel at lists.infradead.org;
> linux-pm at vger.kernel.org; Eduardo Valentin
> Subject: Re: 3.18: lockdep problems in cpufreq
>
> On 15 December 2014 at 03:53, Rafael J. Wysocki <rjw at rjwysocki.net>
> wrote:
> > On Sunday, December 14, 2014 09:36:55 PM Russell King - ARM Linux
> wrote:
> >> Here's a nice Christmas present one of my iMX6 machines gave me
> tonight.
> >> I haven't begun to look into it.
>
> Is the culprit this one?
>
> Fixes: 2dcd851fe4b4 ("thermal: cpu_cooling: Update always cpufreq
> policy with thermal constraints")
>
> As this is what it changed:
>
> @@ -316,21 +312,28 @@ static int cpufreq_thermal_notifier(struct
> notifier_block *nb, {
> struct cpufreq_policy *policy = data;
> unsigned long max_freq = 0;
> + struct cpufreq_cooling_device *cpufreq_dev;
>
> - if (event != CPUFREQ_ADJUST || notify_device == NOTIFY_INVALID)
> + if (event != CPUFREQ_ADJUST)
> return 0;
>
> - if (cpumask_test_cpu(policy->cpu, ¬ify_device-
> >allowed_cpus))
> - max_freq = notify_device->cpufreq_val;
> - else
> - return 0;
> + mutex_lock(&cooling_cpufreq_lock);
> + list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
> + if (!cpumask_test_cpu(policy->cpu,
> + &cpufreq_dev->allowed_cpus))
> + continue;
> +
> + if (!cpufreq_dev->cpufreq_val)
> + cpufreq_dev->cpufreq_val = get_cpu_frequency(
> + cpumask_any(&cpufreq_dev-
> >allowed_cpus),
> + cpufreq_dev->cpufreq_state);
>
> - /* Never exceed user_policy.max */
> - if (max_freq > policy->user_policy.max)
> - max_freq = policy->user_policy.max;
> + max_freq = cpufreq_dev->cpufreq_val;
>
> - if (policy->max != max_freq)
> - cpufreq_verify_within_limits(policy, 0, max_freq);
> + if (policy->max != max_freq)
> + cpufreq_verify_within_limits(policy, 0,
> max_freq);
> + }
> + mutex_unlock(&cooling_cpufreq_lock);
>
> return 0;
> }
>
>
> >> ======================================================
> >> [ INFO: possible circular locking dependency detected ] 3.18.0+
> #1453
> >> Not tainted
> >> -------------------------------------------------------
> >> rc.local/770 is trying to acquire lock:
> >> (cooling_cpufreq_lock){+.+.+.}, at: [<c04abfc4>]
> >> cpufreq_thermal_notifier+0x34/0xfc
> >>
> >> but task is already holding lock:
> >> ((cpufreq_policy_notifier_list).rwsem){++++.+}, at: [<c0042f04>]
> >> __blocking_notifier_call_chain+0x34/0x68
> >>
> >> which lock already depends on the new lock.
> >
> > Well, that's for Viresh.
>
> Maybe not as the rework I have done is queued for this merge window.
> I was afraid really after reading the "offenders" discussion on IRC :)
>
> Cc'd Yadwinder as well who wrote this patch.
Thanks :).
Unfortunately, I didn’t get any such warning though I tested
patch enabling CONFIG_PROVE_LOCKING before posting. It seems
Russell is trying to load imx_thermal as module and parallely
Changing cpufreq governor from userspace, which was not my
test case.
Anyways, after analyzing log and code,I think problem is not
in cpufreq_thermal_notifier which was modified in patch as
stated above. Actual problem is in __cpufreq_cooling_register
which is unnecessarily calling cpufreq_register_notifier()
inside section protected by cooling_cpufreq_lock.
Because cpufreq_policy_notifier_list).rwsem is already held
by store_scaling_governor when __cpufreq_cooling_register is
trying to cpufreq_policy_notifier_list while holding cooling_cpufreq_lock.
So I think following can fix the problem:
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index ad09e51..622ea40 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -484,15 +484,15 @@ __cpufreq_cooling_register(struct device_node *np,
cpufreq_dev->cpufreq_state = 0;
mutex_lock(&cooling_cpufreq_lock);
- /* Register the notifier for first cpufreq cooling device */
- if (cpufreq_dev_count == 0)
- cpufreq_register_notifier(&thermal_cpufreq_notifier_block,
- CPUFREQ_POLICY_NOTIFIER);
cpufreq_dev_count++;
list_add(&cpufreq_dev->node, &cpufreq_dev_list);
mutex_unlock(&cooling_cpufreq_lock);
+ /* Register the notifier for first cpufreq cooling device */
+ if (cpufreq_dev_count == 0)
+ cpufreq_register_notifier(&thermal_cpufreq_notifier_block,
+ CPUFREQ_POLICY_NOTIFIER);
return cool_dev;
}
Best Regards,
Yadwinder
More information about the linux-arm-kernel
mailing list