[PATCH 2/3] thermal: qoriq_thermal: only enable supported sensors

Daniel Lezcano daniel.lezcano at linaro.org
Wed Jun 7 01:28:59 PDT 2023


On 07/06/2023 08:01, Sebastian Krzyszkowiak wrote:
> On piątek, 2 czerwca 2023 15:11:37 CEST Daniel Lezcano wrote:
>> On 01/06/2023 11:52, Peng Fan wrote:
>>> Hi Daniel,
>>>
>>>> Subject: RE: [PATCH 2/3] thermal: qoriq_thermal: only enable supported
>>>> sensors
>>>>
>>>>> Subject: Re: [PATCH 2/3] thermal: qoriq_thermal: only enable supported
>>>>> sensors
>>>>>
>>>>> On 31/05/2023 14:05, Peng Fan wrote:
>>>>>>> Subject: Re: [PATCH 2/3] thermal: qoriq_thermal: only enable
>>>>>>> supported sensors
>>>>>>>
>>>>>>> On 16/05/2023 10:37, Peng Fan (OSS) wrote:
>>>>>>>> From: Peng Fan <peng.fan at nxp.com>
>>>>>>>>
>>>>>>>> There are MAX 16 sensors, but not all of them supported. Such as
>>>>>>>> i.MX8MQ, there are only 3 sensors. Enabling all 16 sensors will
>>>>>>>> touch reserved bits from i.MX8MQ reference mannual, and TMU will
>>>>>>>> stuck, temperature will not update anymore.
>>>>>>>>
>>>>>>>> Fixes: 45038e03d633 ("thermal: qoriq: Enable all sensors before
>>>>>>>> registering them")
>>>>>>>> Signed-off-by: Peng Fan <peng.fan at nxp.com>
>>>>>>>> ---
>>>>>>>>
>>>>>>>>      drivers/thermal/qoriq_thermal.c | 30
>>>>>>>>      +++++++++++++++++++----------
>>>>
>>>> -
>>>>
>>>>>>>>      1 file changed, 19 insertions(+), 11 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/thermal/qoriq_thermal.c
>>>>>>>> b/drivers/thermal/qoriq_thermal.c index
>>>>
>>>> b806a0929459..53748c4a5be1
>>>>
>>>>>>>> 100644
>>>>>>>> --- a/drivers/thermal/qoriq_thermal.c
>>>>>>>> +++ b/drivers/thermal/qoriq_thermal.c
>>>>>>>> @@ -31,7 +31,6 @@
>>>>>>>>
>>>>>>>>      #define TMR_DISABLE	0x0
>>>>>>>>      #define TMR_ME		0x80000000
>>>>>>>>      #define TMR_ALPF	0x0c000000
>>>>>>>>
>>>>>>>> -#define TMR_MSITE_ALL	GENMASK(15, 0)
>>>>>>>>
>>>>>>>>      #define REGS_TMTMIR	0x008	/* Temperature measurement
>>>>>>>
>>>>>>> interval Register */
>>>>>>>
>>>>>>>>      #define TMTMIR_DEFAULT	0x0000000f
>>>>>>>>
>>>>>>>> @@ -105,6 +104,11 @@ static int tmu_get_temp(struct
>>>>>>>
>>>>>>> thermal_zone_device *tz, int *temp)
>>>>>>>
>>>>>>>>      	 * within sensor range. TEMP is an 9 bit value representing
>>>>>>>>      	 * temperature in KelVin.
>>>>>>>>      	 */
>>>>>>>>
>>>>>>>> +
>>>>>>>> +	regmap_read(qdata->regmap, REGS_TMR, &val);
>>>>>>>> +	if (!(val & TMR_ME))
>>>>>>>> +		return -EAGAIN;
>>>>>>>
>>>>>>> How is this change related to what is described in the changelog?
>>>>>>
>>>>>> devm_thermal_zone_of_sensor_register will invoke get temp, since we
>>>>>> reverted the 45038e03d633 did, we need to check TMR_ME to avoid
>>>>>
>>>>> return
>>>>>
>>>>>> invalid temperature.
>>>>>>
>>>>>    From a higher perspective if the sensor won't be enabled, then the
>>>>>
>>>>> thermal zone should not be registered, the get_temp won't happen on a
>>>>> disabled sensor and this test won't be necessary, no ?
>>>
>>> After thinking more, I'd prefer current logic.
>>>
>>> We rely on devm_thermal_of_zone_register's return value to know
>>> whether there is a valid zone, then set sites bit, and after collected
>>> all site bits, we enable the thermal IP.
>>>
>>> If move the enabling thermal IP before devm_thermal_of_zone_register,
>>> We need check dtb thermal zone, to know which zone is valid for current
>>> thermal IP. This would complicate the design.
>>>
>>> So just checking the enabling bit in get temperature would be much
>>> simpler, and there just a small window before enabling thermal IP.
>>
>> If the thermal zone is not described, then the thermal zone won't be
>> created as it fails with -ENODEV and thus get_temp won't be called on a
>> disabled site, right?
> 
> That's not what the problem is. It's about zones that *will* be created -
> since the driver only knows that a thermal zone isn't described in the device
> tree after it fails registering, it can't enable the site *before* the zone
> gets registered - so it happens afterwards. That's why it needs this check to
> not return a bogus initial value before the site gets actually enabled later
> in qoriq_tmu_register_tmu_zone.

Sorry, I get the point but I don't see how that can happen:

qoriq_tmu_register_tmu_zone() calls devm_thermal_of_zone_register() for 
*all* sites regardless if they really exists or not.

Under the hood, the function devm_thermal_of_zone_register() calls 
thermal_of_zone_register(). This one fails when calling 
of_thermal_zone_find() because it does not exist and returns -ENODEV.

Hence, the thermal_zone_device_register_with_trips() is not called, the 
thermal zone is not created neither updated.

So I don't understand why the test:

+	regmap_read(qdata->regmap, REGS_TMR, &val);
+	if (!(val & TMR_ME))
+		return -EAGAIN;

is needed in the get_temp() ops as the thermal zone for this disabled 
site should not exist.

I'm not putting in question the series, just wanting to avoid a 
potential pointless check in an ops.


[ ... ]

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog




More information about the linux-arm-kernel mailing list