[PATCH] nvme: hwmon: Add support for throttling temperature feature

Tokunori Ikegami ikegami.t at gmail.com
Sun Aug 7 08:02:26 PDT 2022


On 2022/08/07 5:19, Armin Wolf wrote:
> On 06.08.22 at 13:58, Tokunori Ikegami wrote:
>
>> Note: Sorry, let me resend the mail below in text format, since it was
>> not delivered to the mailing lists because it contained an HTML subpart.
>>
>> Hi,
>>
>> Thanks for your comments.
>>
>> On 2022/08/06 17:31, Guenter Roeck wrote:
>>> On Sat, Aug 06, 2022 at 02:46:06PM +0900, Tokunori Ikegami wrote:
>>>> NVMe drives optionally support the host controlled thermal
>>>> management feature.
>>>> The thermal management temperatures are different from the
>>>> temperature threshold.
>>>> So add functionality to set the throttling temperature values.
>>>>
>>>> Signed-off-by: Tokunori Ikegami <ikegami.t at gmail.com>
>>
>> Actually, I think the suggested attributes do not match the
>> throttling temperatures, as shown below.
>>
>>   temp[1-*]_emergency: Temperature emergency max value, for chips
>> supporting more than two upper temperature limits.
>>   temp[1-*]_lcrit: Temperature critical min value, typically lower
>> than corresponding temp_min values.
>>
>>   Thermal Management Temperature 1 (TMT1): This field specifies the
>> temperature, in Kelvins, when the controller begins to transition to
>> lower power active power states or performs vendor specific thermal
>> management actions while minimizing the impact on performance (e.g.,
>> light throttling) in order to attempt to reduce the Composite
>> Temperature.
>>   Thermal Management Temperature 2 (TMT2): This field specifies the
>> temperature, in Kelvins, when the controller begins to transition to
>> lower power active power states or perform vendor specific thermal
>> management actions regardless of the impact on performance (e.g.,
>> heavy throttling) in order to attempt to reduce the Composite
>> Temperature.
>>
> Maybe those two throttle thresholds could be represented by tempX_crit 
> and tempX_emergency,
> the special throttle effect could be documented in the drivers 
> documentation.
>
> Since tempX_crit is already used to report CCTEMP, maybe this value
> could be reported with tempX_rated_max instead?
> As far as I know, CCTEMP is the maximum composite temperature rating
> of the NVMe device, so reporting it as tempX_rated_max would make sense.

Thanks for your advice. But by default the throttle thresholds are
actually lower than both the current tempX_max and tempX_crit, so it
seems difficult to use the current tempX values for the throttle
thresholds.
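
To illustrate the ordering, here is a minimal sketch, not the driver
code. It assumes that TMT1 and TMT2 are packed into one 32-bit feature
value with TMT1 in bits 31:16 and TMT2 in bits 15:0, and that hwmon
values are in millidegrees Celsius. The function and parameter names
are hypothetical and only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* NVMe reports Kelvin; hwmon attributes use millidegrees Celsius. */
static long kelvin_to_millicelsius(uint16_t kelvin)
{
	return (long)kelvin * 1000L - 273150L;
}

/* Assumption: TMT1 occupies bits 31:16 of the feature value. */
static uint16_t hctm_tmt1(uint32_t dword11)
{
	return (uint16_t)(dword11 >> 16);
}

/* Assumption: TMT2 occupies bits 15:0 of the feature value. */
static uint16_t hctm_tmt2(uint32_t dword11)
{
	return (uint16_t)(dword11 & 0xffff);
}

/*
 * Return true when both throttle thresholds are lower than the values
 * behind tempX_max and tempX_crit. All temperatures are in Kelvin.
 */
static bool throttle_below_limits(uint32_t dword11, uint16_t temp_max_k,
				  uint16_t temp_crit_k)
{
	uint16_t tmt1 = hctm_tmt1(dword11);
	uint16_t tmt2 = hctm_tmt2(dword11);

	return tmt1 < temp_max_k && tmt1 < temp_crit_k &&
	       tmt2 < temp_max_k && tmt2 < temp_crit_k;
}
```

For example, with made-up values TMT1 = 348 K, TMT2 = 353 K,
tempX_max = 358 K and tempX_crit = 363 K, throttle_below_limits()
returns true, so neither throttle threshold can stand in for tempX_max
or tempX_crit without changing what those attributes mean.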

Regards,
Ikegami

>
> Armin Wolf
>
>>> NACK. There are several existing limit attributes which can be used
>>> for this purpose. I would suggest using the EMERGENCY and LCRIT
>>> attributes.
>>>
>>> Furthermore, one can not just extend the hwmon ABI without discussion,
>>> much less as part of a patch introducing its use. Any attribute
>>> introduced into the ABI must benefit more than one device, and a
>>> matching implementation in the sensors command and the lm-sensors
>>> library is expected.
>>
>> Sorry, I am not sure about the hwmon ABI situation, but if possible
>> could you please consider or discuss extending the attributes based on
>> this patch review, since the suggested attributes seem difficult to
>> use instead? (Is that difficult?)
>> By the way, I have already created the lm-sensors pull request below.
>>   <https://github.com/lm-sensors/lm-sensors/pull/406>
>>
>> Regards,
>> Ikegami
>>
>>>
>>> Guenter


