3.18: lockdep problems in cpufreq
Russell King - ARM Linux
linux at arm.linux.org.uk
Mon Dec 15 05:46:36 PST 2014
On Mon, Dec 15, 2014 at 06:58:41PM +0530, Yadwinder Singh Brar wrote:
> Unfortunately, I didn’t get any such warning though I tested
> patch enabling CONFIG_PROVE_LOCKING before posting. It seems
> Russell is trying to load imx_thermal as module and parallely
> Changing cpufreq governor from userspace, which was not my
> test case.
No. Yes, imx_thermal is a module, which is loaded by udev very early
in the userspace boot. Later on, in the rc.local, I poke the governor
from userspace.
This is evidenced by the bluetooth modules being loaded after imx_thermal:
Module Size Used by
bnep 10574 2
rfcomm 33720 0
bluetooth 293598 10 bnep,rfcomm
nfsd 90264 0
exportfs 3988 1 nfsd
hid_cypress 1763 0
snd_soc_fsl_spdif 9753 2
imx_pcm_dma 1137 1 snd_soc_fsl_spdif
imx_sdma 12885 2
imx2_wdt 3248 0
imx_thermal 5386 0
snd_soc_imx_spdif 1877 0
and the timestamp on the bluetooth message:
[ 25.503739] Bluetooth: Core ver 2.19
vs the timestamp on the lockdep message:
[ 29.499195] [ INFO: possible circular locking dependency detected ]
My rc.local does this:
# Configure cpufreq
echo 1 > /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> Anyways, after analyzing log and code,I think problem is not
> in cpufreq_thermal_notifier which was modified in patch as
> stated above. Actual problem is in __cpufreq_cooling_register
> which is unnecessarily calling cpufreq_register_notifier()
> inside section protected by cooling_cpufreq_lock.
> Because cpufreq_policy_notifier_list).rwsem is already held
> by store_scaling_governor when __cpufreq_cooling_register is
> trying to cpufreq_policy_notifier_list while holding cooling_cpufreq_lock.
>
> So I think following can fix the problem:
>
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index ad09e51..622ea40 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -484,15 +484,15 @@ __cpufreq_cooling_register(struct device_node *np,
> cpufreq_dev->cpufreq_state = 0;
> mutex_lock(&cooling_cpufreq_lock);
>
> - /* Register the notifier for first cpufreq cooling device */
> - if (cpufreq_dev_count == 0)
> - cpufreq_register_notifier(&thermal_cpufreq_notifier_block,
> - CPUFREQ_POLICY_NOTIFIER);
> cpufreq_dev_count++;
> list_add(&cpufreq_dev->node, &cpufreq_dev_list);
>
> mutex_unlock(&cooling_cpufreq_lock);
>
> + /* Register the notifier for first cpufreq cooling device */
> + if (cpufreq_dev_count == 0)
> + cpufreq_register_notifier(&thermal_cpufreq_notifier_block,
> + CPUFREQ_POLICY_NOTIFIER);
No, sorry, this patch is worse.
1. cpufreq_register_notifier() will never be called, even on the first
caller to this code: at the point where it's tested for zero, it will
be incremented to one by the previous code.
2. What happens if two threads come here?
The first thread succeeds in grabbing cooling_cpufreq_lock. The second
thread stalls waiting for cooling_cpufreq_lock to be released.
The first thread increments cpufreq_dev_count, adds to the list, and then
releases the lock. At this point, control may be passed to the second
thread (since mutex_unlock() will wake it up.) The second thread gets
into the critical region, and increments cpufreq_dev_count again.
By the time the first thread runs, cpufreq_dev_count may be two at this
point.
Unfortunately, you do need some kind of synchronisation here. If it's
not important when cpufreq_register_notifier() gets called, maybe this
can work:
bool register;
mutex_lock(&cooling_cpufreq_lock);
register = cpufreq_dev_count++ == 0;
list_add(&cpufreq_dev->node, &cpufreq_dev_list);
mutex_unlock(&cooling_cpufreq_lock);
if (register)
cpufreq_register_notifier(&thermal_cpufreq_notifier_block,
CPUFREQ_POLICY_NOTIFIER);
However, I suspect that may well be buggy if we have a cleanup path which
unregisters the notifier when cpufreq_dev_count is decremented to zero...
which we seem to have in cpufreq_cooling_unregister().
The answer may well be to have finer grained locking here:
- one lock to protect cpufreq_dev_list, which is only ever taken when
adding or removing entries from it
- a second lock to protect cpufreq_dev_count and the calls to
cpufreq_register_notifier() and cpufreq_unregister_notifier()
and you would /never/ take either of those two locks inside each other.
In other words:
mutex_lock(&cooling_list_lock);
list_add(&cpufreq_dev->node, &cpufreq_dev_list);
mutex_unlock(&cooling_list_lock);
mutex_lock(&cooling_cpufreq_lock);
if (cpufreq_dev_count++ == 0)
cpufreq_register_notifier(&thermal_cpufreq_notifier_block,
CPUFREQ_POLICY_NOTIFIER);
mutex_unlock(&cooling_cpufreq_lock);
and similar in the cleanup path. The notifier itself would only ever
take the cooling_list_lock.
--
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.
More information about the linux-arm-kernel
mailing list