[PATCH v4 05/14] soc: mediatek: mtk-svs: use svs clk control APIs

Matthias Brugger matthias.bgg at gmail.com
Mon Feb 6 04:09:41 PST 2023



On 06/02/2023 03:01, Roger Lu (陸瑞傑) wrote:
> Hi Matthias Sir,
> 
> 
> On Thu, 2023-02-02 at 11:29 +0100, Matthias Brugger wrote:
>> 你好,
> 
> I got shock and thought someone used your name to reply. However,
> your email account helps me clear my mind. Haha.. Nice and warm to see mandarin
> on patchwork. It's so fresh and exciting :-).
> 

谢谢。 I'm learning mainland Chinese for a few month now, I also learned that you 
use different symbols in Taiwan, which I don't know. 对不起。

>>
>> On 01/02/2023 13:28, Roger Lu (陸瑞傑) wrote:
>>> Hi Matthias Sir,
>>>
>>> On Tue, 2023-01-31 at 14:19 +0100, Matthias Brugger wrote:
>>>>
>>>> On 11/01/2023 08:45, Roger Lu wrote:
>>>>> In MediaTek HW design, svs and thermal both use the same clk source.
>>>>> It means that svs clk reference count from CCF includes thermal control
>>>>> counts. That makes svs driver confuse whether it disabled svs's main clk
>>>>> or not from CCF's perspective and lead to turn off their shared clk
>>>>> unexpectedly. Therefore, we add svs clk control APIs to make sure svs's
>>>>> main clk is controlled well by svs driver itself.
>>>>>
>>>>> Here is a NG example. Rely on CCF's reference count and cause problem.
>>>>>
>>>>> thermal probe (clk ref = 1)
>>>>> -> svs probe (clk ref = 2)
>>>>>       -> svs suspend (clk ref = 1)
>>>>>          -> thermal suspend (clk ref = 0)
>>>>>          -> thermal resume (clk ref = 1)
>>>>>       -> svs resume (encounter error, clk ref = 1)
>>>>>       -> svs suspend (clk ref = 0)
>>>>>          -> thermal suspend (Fail here, thermal HW control w/o clk)
>>>>>
>>>>> Fixes: a825d72f74a3 ("soc: mediatek: fix missing clk_disable_unprepare()
>>>>> on
>>>>> err in svs_resume()")
>>>>> Signed-off-by: Roger Lu <roger.lu at mediatek.com>
>>>>
>>>> That looks wrong. Although I don't out of my mind, there should be a way
>>>> to
>>>> tell
>>>> the clock framework that this clock is shared between several devices.
>>>>
>>>> I wonder if using clk_enable and clk_disable in svs_resume/suspend
>>>> wouldn't
>>>> be
>>>> enough.
>>>
>>> Oh yes, Common Clock Framework (CCF) knows the clock shared between several
>>> devices and maintains clock "on/off" by reference count.
>>>
>>
>> The thing is if you use clk_prepare_enable then the clock framework check's
>> if
>> the clock is already prepared, which could happen like you described in the
>> svs_resume (encount error) case in the commit message. The question is, can't
>> we
>> just use clk_enable and clk_disable in resume/suspend and only prepare the
>> clock
>> in the probe function?
> 
> We'll think if this can fix the problem. Thanks for the advice very much.
> 
>>
>>> We concern how to stop running svs_suspend() when svs clk is already
>>> disabled by
>>> svs_resume(). Take an example as below, if we refers to __clk_is_enabled()
>>> result for knowing svs clk status, it will return "true" all the time
>>> because
>>> thermal clk is still on. This causes the problem mentioned in commit
>>> message.
>>>
>>
>> I would expect that the kernel takes care that we can't enter a resume path
>> for
>> a device before the suspend path has finished. Honestly I don't really
>> understand the problem here. It seems something different then what you
>> described in the commit message.
>>
>> Please help me understand better.
> 
> I see. This patch title needs to be changed to "avoid turning off svs clk twice
> unexpectedly" for pointing out the problem precisely. We saw a loophole that svs
> clk might be turned off in svs_resume() first and in svs_suspend() again without
> enabling svs clk during these the process. Therefore, we try to fix it by this
> patch. Below is our thinking process to explain how we got here.
> 
> 1. (abandoned) We add __clk_is_enabled() check in svs_suspend() to prevent svs
> clk from being turned off twice when svs_resume() turned off svs clk in the
> error-handling process. Nonetheless, we met the NG case in the commit message.
> 2. (current patch) We add svs clk control hint to understand if we need to run
> svs_suspend() or not if svs_resume() turned off svs clk before.
> 

Did you had a look on the dev_pm_ops? Maybe we can use suspend_late, 
resume_early to make sure there is no race condition. I wonder also if we can't 
make sure that this does not happen using device links. Sorry, I can't give 
better guidance on how to use this technologies, but I have the feeling we can 
fix this with existing infrastructure.

再见。

Matthias

>>
>> 谢谢,再见
> 
> :-)
> 
> 
> Sincerely,
> Roger Lu



More information about the linux-arm-kernel mailing list