[PATCH 1/3] thermal: handle get_temp() errors properly

Brian Norris briannorris at chromium.org
Fri Nov 18 21:30:15 PST 2016


Hi,

On Fri, Nov 18, 2016 at 07:41:59PM -0800, Eduardo Valentin wrote:
> On Fri, Nov 18, 2016 at 03:52:55PM -0800, Brian Norris wrote:
> > If using CONFIG_THERMAL_EMULATION, there's a corner case where we might
> > get an error from the zone's get_temp() callback, but we'll ignore that
> > and keep using its value. Let's just error out properly instead.
> > 
> > Signed-off-by: Brian Norris <briannorris at chromium.org>
> > ---
> >  drivers/thermal/thermal_core.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > index 911fd964c742..0fa497f10d25 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -494,6 +494,8 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
> >  	mutex_lock(&tz->lock);
> >  
> >  	ret = tz->ops->get_temp(tz, temp);
> > +	if (ret)
> > +		goto exit_unlock;
> 
> Yeah, but the follow through is intentional, if I am not mistaken.

OK...but it has a bug. It potentially utilizes an uninitialized value
for *temp.

> >  
> >  	if (IS_ENABLED(CONFIG_THERMAL_EMULATION) && tz->emul_temperature) {
> 
> Even if the driver is not able to read real temperature, but emul temp
> is configured, then there is still opportunity to report the emulated
> temperature.

OK, maybe, but you should avoid doing this comparison then:

513                 if (!ret && *temp < crit_temp)
514                         *temp = tz->emul_temperature;

Note that 'ret' might be 0 (from the calls to ->get_trip_type()), and then
you're comparing with the uninitialized value of *temp. So you need some
solution that accounts for this and decides to ignore the real
temperature properly.

> >  		for (count = 0; count < tz->trips; count++) {
> > @@ -514,6 +516,7 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
> >  			*temp = tz->emul_temperature;
> 
> And if you check the lines at the bottom of the loop, you will see that,
> in the fail case, we will stil compare to what is the content of temp,
> which might be problematic.

Yes...are you saying the same thing I am above?

> I would prefer we consider the patch I sent
> some time ago:
> https://patchwork.kernel.org/patch/7876381/

Honestly I didn't look that deeply into the framework here (and I also
don't use CONFIG_THERMAL_EMULATION), I was just fixing something that
was obviously wrong.

But on first read, that patch looks good to me -- although it'd be good
to note the uninitialized value fix in the comit log. Any reason that
didn't end up getting merged? It looks like it got reviewed, and you're
a thermal subsystem maintainer...

Brian



More information about the Linux-rockchip mailing list