[PATCH v2 1/3] wifi: mt76: mt7915: rework mt7915_thermal_set_cur_throttle_state()
Howard-YH Hsu (許育豪)
Howard-YH.Hsu at mediatek.com
Thu Dec 8 04:44:11 PST 2022
On Wed, 2022-12-07 at 09:15 +0100, Nicolas Cavallari wrote:
> On 07/12/2022 06:24, Howard Hsu wrote:
> > This patch includes 3 changes:
> > 1. The maximum throttle state can be set to 100 to fix the problem that
> > thermal_protect_disable can never be triggered.
>
> You are modifying the cooling_device part. The cooling_device is
> explicitly configured to have a max state of MT7915_CDEV_THROTTLE_MAX
> (=99), so the thermal subsystem will probably prevent
> mt7915_thermal_set_cur_throttle_state from being called with a higher
> value. It will also probably complain if get_cur_state starts returning
> values above MT7915_CDEV_THROTTLE_MAX.
>
> And, as the comment below indicates, the thermal subsystem expect that a
> higher state provide more cooling. So if 99 means "maximum cooling",
> 100 cannot mean "disable cooling".
>
> Also, last time I tried, thermal_protect_disable didn't work; It didn't
> disable anything, the previous thermal throttle kept being applied.
> Maybe a new firmware fixed this, but the kernel cannot simply expect the
> firmware to be up to date.
>
Thanks for your comments. Let me walk through an example to confirm
that I understand them correctly.
1. The current cooling state of the cooling device is 50 (cur_state =
50).
2. The cooling state is set to 100, meaning "disable cooling".
3. The thermal subsystem decides to decrease the state because the rest
of the system is cooler, and it adjusts downward from cur_state, which
is 100. For example, the thermal subsystem sets cur_state to 90. This
obviously makes performance worse than at step 1, even though the
system is cooler. Designing 100 to mean "disable cooling" would
therefore confuse the thermal governor.
Let me know if there is any misunderstanding. Otherwise, I will remove
the first change from this patch.
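To make the example above concrete, here is a minimal standalone sketch
(not driver code). It only reuses the two constants mentioned in this
thread and the "throttling = MAX - state" arithmetic from the patch; the
helper names model_duty_percent() and model_is_monotonic() are
hypothetical, and treating the throttling value as a duty percentage is
a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the thread: the cooling device advertises a max
 * state of 99, while the throttle scale itself runs up to 100. */
#define MT7915_CDEV_THROTTLE_MAX    99
#define MT7915_THERMAL_THROTTLE_MAX 100

/* Hypothetical model: percentage of normal activity left to the radio
 * for a given cooling state. 'state_100_disables' models the proposal
 * where state 100 means "thermal protection off". */
static unsigned int model_duty_percent(unsigned int state,
				       bool state_100_disables)
{
	assert(state <= MT7915_THERMAL_THROTTLE_MAX);

	if (state_100_disables && state == MT7915_THERMAL_THROTTLE_MAX)
		return 100; /* protection disabled: no throttling at all */

	/* Same arithmetic as the patch: throttling = MAX - state. */
	return MT7915_THERMAL_THROTTLE_MAX - state;
}

/* Returns true if raising the state never increases the duty, i.e. the
 * "more state = more cooling" convention holds over [0, max_state]. */
static bool model_is_monotonic(unsigned int max_state,
			       bool state_100_disables)
{
	for (unsigned int s = 0; s < max_state; s++) {
		if (model_duty_percent(s + 1, state_100_disables) >
		    model_duty_percent(s, state_100_disables))
			return false;
	}
	return true;
}
```

In this model, state 50 leaves a duty of 50 and state 90 leaves 10, while
state 100 with the "disable" meaning jumps back to 100. A governor that
steps down from 100 to 90 therefore ends up throttling harder than at
state 50, which is the inconsistency described in step 3.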
> > 2. Throttle state do not need to be different from the previous state.
> > This will make it is impossible for users to just change the
> > trigger/restore temp but not the throttle state.
>
> The throttle state is mostly set by the kernel's thermal governor and
> the user has only very little control over it. The thermal governor
> runs every X seconds and will change the state if it thinks it is too
> low or too high.
>
> The default step_wise governor will aggressively set it to zero if the
> system isn't overheating, for example.
>
I don't think your comment conflicts with the second change. If we
keep the check that the new cooling state must differ from the previous
one, it will bother users who only want to change temp1_crit but not
cur_state. It is unreasonable that, for a new temp1_crit to take effect
in the firmware, the user must also set a different cooling state.
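A small standalone sketch of the situation I mean, under stated
assumptions: struct model_phy, its fw_temp field and model_set_state()
are hypothetical stand-ins, and I am assuming that the temperatures are
only pushed to the firmware as part of setting the cooling state. The
skip_if_unchanged flag models the "state == phy->cdev_state" early
return that the patch removes.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CDEV_THROTTLE_MAX 99 /* max cooling state quoted in the thread */

/* Hypothetical stand-in for the per-phy thermal data; field names
 * loosely follow the quoted diff, but this is not the driver struct. */
struct model_phy {
	unsigned int cdev_state; /* last cooling state applied */
	int throttle_temp[2];    /* [0] = temp1_crit, [1] = temp1_max */
	int fw_temp[2];          /* what the modeled firmware currently uses */
};

/* Returns 0 on success, -1 on invalid input. */
static int model_set_state(struct model_phy *phy, unsigned int state,
			   bool skip_if_unchanged)
{
	assert(phy);

	if (state > MODEL_CDEV_THROTTLE_MAX)
		return -1;
	if (phy->throttle_temp[0] > phy->throttle_temp[1])
		return -1;
	if (skip_if_unchanged && state == phy->cdev_state)
		return 0; /* early return: the new temps are never pushed */

	/* Assumed behavior: applying a state also syncs the temps. */
	phy->fw_temp[0] = phy->throttle_temp[0];
	phy->fw_temp[1] = phy->throttle_temp[1];
	phy->cdev_state = state;
	return 0;
}
```

With skip_if_unchanged set, changing throttle_temp[0] and writing the
same state again leaves fw_temp untouched. With the check removed, the
same write is enough to make the new value take effect.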
> > 3. Add dev_err so that it is easier to see invalid setting while
> > looking at dmesg.
> >
> > Fixes: 771cd8d4c369 ("mt76: mt7915e: Fix degraded performance after temporary overheat")
> > Co-developed-by: Ryder Lee <ryder.lee at mediatek.com>
> > Signed-off-by: Ryder Lee <ryder.lee at mediatek.com>
> > Signed-off-by: Howard Hsu <howard-yh.hsu at mediatek.com>
> > ---
> >  .../net/wireless/mediatek/mt76/mt7915/init.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/init.c b/drivers/net/wireless/mediatek/mt76/mt7915/init.c
> > index c810c31fbd6e..abeecf15f1c8 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7915/init.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7915/init.c
> > @@ -131,14 +131,17 @@ mt7915_thermal_set_cur_throttle_state(struct thermal_cooling_device *cdev,
> >  	u8 throttling = MT7915_THERMAL_THROTTLE_MAX - state;
> >  	int ret;
> >
> > -	if (state > MT7915_CDEV_THROTTLE_MAX)
> > +	if (state > MT7915_THERMAL_THROTTLE_MAX) {
> > +		dev_err(phy->dev->mt76.dev,
> > +			"please specify a valid throttling state\n");
> >  		return -EINVAL;
> > +	}
> >
> > -	if (phy->throttle_temp[0] > phy->throttle_temp[1])
> > -		return 0;
> > -
> > -	if (state == phy->cdev_state)
> > -		return 0;
> > +	if (phy->throttle_temp[0] > phy->throttle_temp[1]) {
> > +		dev_err(phy->dev->mt76.dev,
> > +			"temp1_crit shall not be greater than temp1_max\n");
> > +		return -EINVAL;
> > +	}
> >
> >  	/*
> >  	 * cooling_device convention: 0 = no cooling, more = more cooling
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^
>