[PATCH v2] arm64: dts: rockchip: enable built-in thermal monitoring on rk3588
Daniel Lezcano
daniel.lezcano at linaro.org
Tue Jan 23 16:14:51 PST 2024
On 23/01/2024 20:47, Alexey Charkov wrote:
> On Mon, Jan 22, 2024 at 4:04 AM Daniel Lezcano
> <daniel.lezcano at linaro.org> wrote:
>>
>>
>> Hi Alexey,
>>
>>
>> On 21/01/2024 20:57, Alexey Charkov wrote:
>>> On Fri, Jan 19, 2024 at 8:21 PM Daniel Lezcano
>>> <daniel.lezcano at linaro.org> wrote:
>>> Hello Daniel,
>>>
>>> Thanks a lot for your review and comments! Please see some reflections below.
>>>
>>>> On 09/01/2024 20:19, Alexey Charkov wrote:
>>>>> Include thermal zones information in device tree for rk3588 variants
>>>>> and enable the built-in thermal sensing ADC on RADXA Rock 5B
>>>>>
>>>>> Signed-off-by: Alexey Charkov <alchark at gmail.com>
>>>>> ---
>>>>> Changes in v2:
>>>>> - Dropped redundant comments
>>>>> - Included all CPU cores in cooling maps
>>>>> - Split cooling maps into more granular ones utilizing TSADC
>>>>> channels 1-3 which measure temperature by separate CPU clusters
>>>>> instead of channel 0 which measures the center of the SoC die
>>>>> ---
>>>>> .../boot/dts/rockchip/rk3588-rock-5b.dts | 4 +
>>>>> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 151 ++++++++++++++++++
>>>>> 2 files changed, 155 insertions(+)
>>>>>
>>>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>>>>> index a5a104131403..f9d540000de3 100644
>>>>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>>>>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>>>>> @@ -772,3 +772,7 @@ &usb_host1_ehci {
>>>>> &usb_host1_ohci {
>>>>> status = "okay";
>>>>> };
>>>>> +
>>>>> +&tsadc {
>>>>> + status = "okay";
>>>>> +};
>>>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>>>>> index 8aa0499f9b03..8d54998d0ecc 100644
>>>>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>>>>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>>>>> @@ -10,6 +10,7 @@
>>>>> #include <dt-bindings/reset/rockchip,rk3588-cru.h>
>>>>> #include <dt-bindings/phy/phy.h>
>>>>> #include <dt-bindings/ata/ahci.h>
>>>>> +#include <dt-bindings/thermal/thermal.h>
>>>>>
>>>>> / {
>>>>> compatible = "rockchip,rk3588";
>>>>> @@ -2112,6 +2113,156 @@ tsadc: tsadc at fec00000 {
>>>>> status = "disabled";
>>>>> };
>>>>>
>>>>> + thermal_zones: thermal-zones {
>>>>> + /* sensor near the center of the whole chip */
>>>>> + soc_thermal: soc-thermal {
>>>>> + polling-delay-passive = <20>;
>>>>
>>>> There is no mitigation set for this thermal zone. It is pointless to
>>>> specify a passive polling.
>>>
>>> Indeed, it makes sense to me. There seems to be a catch though in that
>>> the driver calls the generic thermal_of_zone_register during the
>>> initial probe, which expects both of those polling delays to be
>>> present in the device tree, otherwise it simply refuses to add the
>>> respective thermal zone, see drivers/thermal/thermal_of.c:502
>>
>> Usually:
>>
>> polling-delay-passive = <0>;
>> polling-delay = <0>;
>>
>> cf:
>>
>> git grep "polling-delay = <0>" arch/arm64/boot/dts
>
> For some reason when I have both polling-delay-passive and
> polling-delay set to 0, the active cooling map I have in my board DT
> (using a PWM controlled fan) behaves weirdly.
> I use the following fragment in my board DTS:
>
> +&package_thermal {
> + trips {
> + package_fan: package-fan {
> + temperature = <55000>;
> + hysteresis = <2000>;
> + type = "active";
> + };
> + };
> +
> + cooling-maps {
> + map-fan {
> + trip = <&package_fan>;
> + cooling-device = <&fan THERMAL_NO_LIMIT
> THERMAL_NO_LIMIT>;
> + };
> + };
> +};
>
> If I add polling-delay = <1000>; at the top, the fan speeds up and
> down dynamically as the package temperature swings around 55C. If I
> remove that (having set polling-delay = <0>; in rk3588s.dtsi), the fan
> speeds up to the midpoint cooling state once the package temperature
> approaches 55C, and then it just stays there forever: it doesn't speed
> up above the midpoint even as the temperature climbs above 70C, nor
> does it spin down as it falls back to around 45C.
>
> Is that the expected behavior for when the polling is disabled?
I don't know the rest of the DT this fragment was added to, but I'm not
surprised there is misbehavior because the configuration is not correct
in this case.
If there is a thermal zone with an active trip and an associated cooling
device like a fan, then:
-> polling-delay = <a_value>;
-> polling-delay-passive = <0>;
If there is a thermal zone with a passive cooling device like cpufreq
cooling device, then 2 cases:
1. The sensor supports interrupt when crossing the trip point
-> polling-delay = <0>;
-> polling-delay-passive = <a_value>;
2. The sensor does not support interrupt when crossing the trip point
-> polling-delay = <a_value>;
-> polling-delay-passive = <another_value>;
Why?
When the cooling device is a passive cooling device, then the mitigation
happens with a higher temperature sampling rate in order to change the
state of the cooling device hundred of times per second. On a fan, the
cooling effect is too slow for that so we keep the polling for that.
> I haven't yet studied in detail if passive cooling kicks in correctly
> with polling disabled, but this behavior with active cooling left me
> quite confused - any pointers would be much appreciated.
>
> Thanks a lot,
> Alexey
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
More information about the Linux-rockchip
mailing list