[PATCH] ARM: OMAP2+: omap_device: maintain sane runtime pm status around suspend/resume

Kevin Hilman khilman at linaro.org
Thu Nov 7 18:44:33 EST 2013


Nishanth Menon <nm at ti.com> writes:

> OMAP device hooks around suspend|resume_noirq ensures that hwmod
> devices are forced to idle using omap_device_idle/enable as part of
> the last stage of suspend activity.
>
> For a device such as i2c who uses autosuspend, it is possible to enter
> the suspend path with dev->power.runtime_status = RPM_ACTIVE.
>
> As part of the suspend flow, the generic runtime logic would increment
> it's dev->power.disable_depth to 1. This should prevent further
> pm_runtime_get_sync from succeeding once the runtime_status has been
> set to RPM_SUSPENDED.
>
> Now, as part of the suspend_noirq handler in omap_device, we force the
> following: if the device status is !suspended, we force the device
> to idle using omap_device_idle (clocks are cut etc..). This ensures
> that from a hardware perspective, the device is "suspended". However,
> runtime_status is left to be active.
>
> *if* an operation is attempted after this point to
> pm_runtime_get_sync, runtime framework depends on runtime_status to
> indicate accurately the device status, and since it sees it to be
> ACTIVE, it assumes the module is functional and returns a non-error
> value. As a result the user will see pm_runtime_get succeed, however a
> register access will crash due to the lack of clocks.

Ouch.

Dumb Q: who is requesting an i2c transaction after ->suspend_noirq().
The i2c driver itself should be able to detect that it's being accessed
after this point and return an error.

That being said, I agree that omap_device should still be catching this
case in order to find/fix driver races like this.

> To prevent this from happening, we should ensure that runtime_status
> exactly indicates the device status. As a result of this change
> any further calls to pm_runtime_get* would return -EACCES (since
> disable_depth is 1). On resume, we restore the clocks and runtime
> status exactly as we suspended with.
>
> Reported-by: J Keerthy <j-keerthy at ti.com>
> Signed-off-by: Nishanth Menon <nm at ti.com>
> Acked-by: Rajendra Nayak <rnayak at ti.com>
> ---
> patch baseline: V3.12 tag (also applies on linux-next next-20131107 tag)
>
> Logs from 3.12 based vendor kernel:
> Before: http://pastebin.com/m5KxnB7a
> After: http://pastebin.com/8AfX4e7r
>
> The discussion on cpufreq side of the story which triggered this scenario:
> 	http://marc.info/?l=linux-pm&m=138263811321921&w=2
>
> Tested on TI vendor kernel (with dt boot):
> 	AM335x: evm, BBB, sk, BBW
> 	OMAP5uEVM, DRA7-evm
>
>  arch/arm/mach-omap2/omap_device.c |   16 ++++++++++++++--
>  arch/arm/mach-omap2/omap_device.h |    1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
> index b69dd9a..87ecbb0 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -621,6 +621,13 @@ static int _od_suspend_noirq(struct device *dev)
>  
>  	if (!ret && !pm_runtime_status_suspended(dev)) {
>  		if (pm_generic_runtime_suspend(dev) == 0) {
> +			if (!pm_runtime_suspended(dev)) {

Why the addition check for supended here?

This version (as opposed to the _status_suspended() version above will
fail if runtime PM has been disabled from userspace (via
/sys/devices/.../power/control), and will thus prevent low power states
from being hit in suspend if runtime suspend has been disabled from
userspace.  That's a bug.

> +				/* NOTE: *might* indicate driver race */

Yes, a driver race which should then be fixed in the driver.

> +				dev_dbg(dev, "%s: Force suspending\n",
> +					__func__);
> +				pm_runtime_set_suspended(dev);
> +				od->flags |= OMAP_DEVICE_SUSPEND_FORCED;

Not sure why you need an additonal flag.  Why not just always do this
and use the existin OMAP_DEVICE_SUSPENDED flag.

Kevin

> +			}
>  			omap_device_idle(pdev);
>  			od->flags |= OMAP_DEVICE_SUSPENDED;
>  		}
> @@ -634,10 +641,15 @@ static int _od_resume_noirq(struct device *dev)
>  	struct platform_device *pdev = to_platform_device(dev);
>  	struct omap_device *od = to_omap_device(pdev);
>  
> -	if ((od->flags & OMAP_DEVICE_SUSPENDED) &&
> -	    !pm_runtime_status_suspended(dev)) {
> +	if (od->flags & OMAP_DEVICE_SUSPENDED) {
>  		od->flags &= ~OMAP_DEVICE_SUSPENDED;
>  		omap_device_enable(pdev);
> +
> +		if (od->flags & OMAP_DEVICE_SUSPEND_FORCED) {
> +			pm_runtime_set_active(dev);
> +			od->flags &= ~OMAP_DEVICE_SUSPEND_FORCED;
> +		}
> +
>  		pm_generic_runtime_resume(dev);
>  	}
>  
> diff --git a/arch/arm/mach-omap2/omap_device.h b/arch/arm/mach-omap2/omap_device.h
> index 17ca1ae..45885b0 100644
> --- a/arch/arm/mach-omap2/omap_device.h
> +++ b/arch/arm/mach-omap2/omap_device.h
> @@ -38,6 +38,7 @@ extern struct dev_pm_domain omap_device_pm_domain;
>  
>  /* omap_device.flags values */
>  #define OMAP_DEVICE_SUSPENDED		BIT(0)
> +#define OMAP_DEVICE_SUSPEND_FORCED	BIT(1)
>  
>  /**
>   * struct omap_device - omap_device wrapper for platform_devices



More information about the linux-arm-kernel mailing list