[PATCH 1/3] thermal: handle get_temp() errors properly

Eduardo Valentin edubezval at gmail.com
Tue Nov 22 03:00:47 PST 2016


On Tue, Nov 22, 2016 at 03:52:25PM +0800, Zhang Rui wrote:
> Hi, Brian,
> 
> On Fri, 2016-11-18 at 21:30 -0800, Brian Norris wrote:
> > Hi,
> > 
> > On Fri, Nov 18, 2016 at 07:41:59PM -0800, Eduardo Valentin wrote:
> > > 
> > > On Fri, Nov 18, 2016 at 03:52:55PM -0800, Brian Norris wrote:
> > > > 
> > > > If using CONFIG_THERMAL_EMULATION, there's a corner case where we
> > > > might
> > > > get an error from the zone's get_temp() callback, but we'll
> > > > ignore that
> > > > and keep using its value. Let's just error out properly instead.
> > > > 
> > > > Signed-off-by: Brian Norris <briannorris at chromium.org>
> > > > ---
> > > >  drivers/thermal/thermal_core.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/drivers/thermal/thermal_core.c
> > > > b/drivers/thermal/thermal_core.c
> > > > index 911fd964c742..0fa497f10d25 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -494,6 +494,8 @@ int thermal_zone_get_temp(struct
> > > > thermal_zone_device *tz, int *temp)
> > > >  	mutex_lock(&tz->lock);
> > > >  
> > > >  	ret = tz->ops->get_temp(tz, temp);
> > > > +	if (ret)
> > > > +		goto exit_unlock;
> > > Yeah, but the follow through is intentional, if I am not mistaken.
> > OK...but it has a bug. It potentially utilizes an uninitialized value
> > for *temp.
> > 
> Agreed.

I also agree that this section of current get_temp is buggy. That is why 
I sent the patch some time ago.

> > > 
> > > > 
> > > >  
> > > >  	if (IS_ENABLED(CONFIG_THERMAL_EMULATION) && tz-
> > > > >emul_temperature) {
> > > Even if the driver is not able to read real temperature, but emul
> > > temp
> > > is configured, then there is still opportunity to report the
> > > emulated
> > > temperature.
> > OK, maybe, but you should avoid doing this comparison then:
> > 
> > 513                 if (!ret && *temp < crit_temp)
> > 514                         *temp = tz->emul_temperature;
> > 
> > Note that 'ret' might be 0 (from the calls to ->get_trip_type()), and
> > then
> > you're comparing with the uninitialized value of *temp. So you need
> > some
> > solution that accounts for this and decides to ignore the real
> > temperature properly.
> > 
> right.
> > > 
> > > > 
> > > >  		for (count = 0; count < tz->trips; count++) {
> > > > @@ -514,6 +516,7 @@ int thermal_zone_get_temp(struct
> > > > thermal_zone_device *tz, int *temp)
> > > >  			*temp = tz->emul_temperature;
> > > And if you check the lines at the bottom of the loop, you will see
> > > that,
> > > in the fail case, we will stil compare to what is the content of
> > > temp,
> > > which might be problematic.
> > Yes...are you saying the same thing I am above?

Yes, Brian, we are concerned about the same bug.


> > 
> > > 
> > > I would prefer we consider the patch I sent
> > > some time ago:
> > > https://patchwork.kernel.org/patch/7876381/
> > Honestly I didn't look that deeply into the framework here (and I
> > also
> > don't use CONFIG_THERMAL_EMULATION), I was just fixing something that
> > was obviously wrong.

Yeah, but that is why we need people to look the code considering all
features. :-)


> > 
> > But on first read, that patch looks good to me -- although it'd be
> > good
> > to note the uninitialized value fix in the comit log. Any reason that
> > didn't end up getting merged? It looks like it got reviewed, and
> > you're
> > a thermal subsystem maintainer...
> > 

I do not remember why Rui postponed it. A note of clarification, for
things that touch thermal core, I agree with Rui that they go through
his tree. Besides, I tend to avoid acking and sending my own patches
without proper review, which was not the case of that patch, that was
just postponed and fell into the cracks somehow.

> hmmm, I forgot why I missed this one in the end.
> Eduardo,
> would you mind refresh and resend the patch?

Yeah sure. I have at least three extra patch sets on thermal core on
my queue. But I would like to get first the thermal sysfs reorg in
first. This fix is one of the changes that will go on top of the thermal
sysfs reorg.

BR,

Eduardo 



More information about the Linux-rockchip mailing list