[PATCH] cpuidle: mvebu: Fix the CPU PM notifier usage

Daniel Lezcano daniel.lezcano at linaro.org
Tue Mar 3 07:20:26 PST 2015


On 03/03/2015 03:58 PM, Fulvio wrote:
>
>> I didn't know you experimented random kernel panics and that you thought
>> it was related to the CPU Idle driver.
>>
>>> All i can say is that the system use the "armadaxp_idle" driver and
>>> works fine when running "stress --cpu 8" in background.
>>> I asked Netgear to provide a firmware without the idle driver to
>>> confirm if it's the cause of the problem, but they did not answered.
>>
>> I think that if you disable all the state using by doing an
>>
>> echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable
>>
>> where NUM is the nuymbert of the state. Then it should disable the
>> cpuidle on the fly.
> Thanks, i'll try that as soon as possible (i gave back the unit to my
> client) and report back.
>
> However, the description of the cpu_pm_enter function state:
> "Must be called on the affected CPU with interrupts disabled.  Platform
> is responsible for ensuring that cpu_pm_enter is not called twice on the
> same CPU before cpu_pm_exit is called. Notified drivers can include VFP
> co-processor, interrupt controller and its PM extensions, local CPU
> timers context save/restore which shouldn't be interrupted. Hence it
> must be called with interrupts disabled."
>
> and the point is: it that an invariant? Do current code and future code
> safely assume that cpu_pm_enter is not called twice?

The fix is correct. The cpu_pm_enter/exit symmetry must be kept because 
we don't know what the notifier clients are doing.

The point is : can we send it to stable@ as a bug fix or not ?

> For example if cpu_pm_enter do "context save" and cpu_pm_exit do
> "context restore", calling twice cpu_pm_enter will overwrite the
> previous saved context: is that safe in all circumstances?

That is the drawback of the notifiers: the kernel provides a service and 
everyone plug something on it. The cpu_pm notifier are very low level 
functions, so the answer of your question is not obvious. I already 
checked all the cpuidle drivers if the potential bug you reported is 
there or not but apparently everything else is fine, cpu_pm_exit is 
always called after cpu_pm_enter.

As you stated, the API description implies cpu_pm_exit must be called 
after cpu_pm_enter. So the fix is right.

> I assume the rule " It must fix a real bug that bothers people (not a,
> "This could be a problem..." type thing)." is to avoid committing
> useless changes that may introduce new bugs, but i do not think that
> apply to this case: a bug report from an unknown user (me) should change
> nothing.

It would be perfect if we can succeed to reproduce the bug you are 
facing and check the patch fixes it. In this case, it goes to stable@ 
automatically.


-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog




More information about the linux-arm-kernel mailing list