[PATCH V2] PM / Domains: Fix initial default state of the need_restore flag

Rafael J. Wysocki rjw at rjwysocki.net
Fri Nov 14 15:50:12 PST 2014


On Tuesday, November 11, 2014 11:07:08 AM Ulf Hansson wrote:
> The initial state of the device's need_restore flag shouldn't depend on
> the current state of the PM domain. For example it should be perfectly
> valid to attach an inactive device to a powered PM domain.
> 
> The pm_genpd_dev_need_restore() API allows us to update the need_restore
> flag to somewhat cope with such scenarios. Typically that should have
> been done from drivers/buses ->probe() since it's those that put the
> requirements on the value of the need_restore flag.
> 
> Until recently, the Exynos SOCs were the only user of the
> pm_genpd_dev_need_restore() API, though invoking it from a centralized
> location while adding devices to their PM domains.
> 
> Because Exynos has now switched to the generic OF-based PM domain
> look-up, it's no longer possible to invoke the API from a centralized
> location. The reason is that devices are now added to their PM
> domains during the probe sequence.
> 
> Commit "ARM: exynos: Move to generic PM domain DT bindings"
> did the switch for Exynos to the generic OF-based PM domain look-up,
> but it also removed the call to pm_genpd_dev_need_restore(). This
> caused a regression for some of the Exynos drivers.
> 
> To handle things more properly in the generic PM domain, let's change
> the default initial value of the need_restore flag to reflect that the
> state is unknown. As soon as some of the runtime PM callbacks get
> invoked, update the initial value accordingly.
> 
> Moreover, since the generic PM domain verifies that all devices
> are both runtime PM enabled and suspended, using pm_runtime_suspended()
> while pm_genpd_poweroff() is invoked from the scheduled work, we can be
> sure that the PM domain won't be powered off while it has active
> devices.
> 
> Do note that the generic PM domain can still only know about active
> devices which have been activated by invoking their runtime PM resume
> callback. In other words, buses/drivers using pm_runtime_set_active()
> during ->probe() will still suffer from a race condition, potentially
> probing a device without its PM domain being powered. That issue
> will have to be solved using a different approach.
> 
> This is a log from the boot regression for Exynos5, which is fixed by
> this patch.
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 308 at ../drivers/clk/clk.c:851 clk_disable+0x24/0x30()
> Modules linked in:
> CPU: 0 PID: 308 Comm: kworker/0:1 Not tainted 3.18.0-rc3-00569-gbd9449f-dirty #10
> Workqueue: pm pm_runtime_work
> [<c0013c64>] (unwind_backtrace) from [<c0010dec>] (show_stack+0x10/0x14)
> [<c0010dec>] (show_stack) from [<c03ee4cc>] (dump_stack+0x70/0xbc)
> [<c03ee4cc>] (dump_stack) from [<c0020d34>] (warn_slowpath_common+0x64/0x88)
> [<c0020d34>] (warn_slowpath_common) from [<c0020d74>] (warn_slowpath_null+0x1c/0x24)
> [<c0020d74>] (warn_slowpath_null) from [<c03107b0>] (clk_disable+0x24/0x30)
> [<c03107b0>] (clk_disable) from [<c02cc834>] (gsc_runtime_suspend+0x128/0x160)
> [<c02cc834>] (gsc_runtime_suspend) from [<c0249024>] (pm_generic_runtime_suspend+0x2c/0x38)
> [<c0249024>] (pm_generic_runtime_suspend) from [<c024f44c>] (pm_genpd_default_save_state+0x2c/0x8c)
> [<c024f44c>] (pm_genpd_default_save_state) from [<c024ff2c>] (pm_genpd_poweroff+0x224/0x3ec)
> [<c024ff2c>] (pm_genpd_poweroff) from [<c02501b4>] (pm_genpd_runtime_suspend+0x9c/0xcc)
> [<c02501b4>] (pm_genpd_runtime_suspend) from [<c024a4f8>] (__rpm_callback+0x2c/0x60)
> [<c024a4f8>] (__rpm_callback) from [<c024a54c>] (rpm_callback+0x20/0x74)
> [<c024a54c>] (rpm_callback) from [<c024a930>] (rpm_suspend+0xd4/0x43c)
> [<c024a930>] (rpm_suspend) from [<c024bbcc>] (pm_runtime_work+0x80/0x90)
> [<c024bbcc>] (pm_runtime_work) from [<c0032a9c>] (process_one_work+0x12c/0x314)
> [<c0032a9c>] (process_one_work) from [<c0032cf4>] (worker_thread+0x3c/0x4b0)
> [<c0032cf4>] (worker_thread) from [<c003747c>] (kthread+0xcc/0xe8)
> [<c003747c>] (kthread) from [<c000e738>] (ret_from_fork+0x14/0x3c)
> ---[ end trace 40cd58bcd6988f12 ]---
> 
> Fixes: a4a8c2c4962bb655 (ARM: exynos: Move to generic PM domain DT bindings)
> Reported-by: Sylwester Nawrocki <s.nawrocki at samsung.com>
> Reviewed-by: Sylwester Nawrocki <s.nawrocki at samsung.com>
> Tested-by: Sylwester Nawrocki <s.nawrocki at samsung.com>
> Reviewed-by: Kevin Hilman <khilman at linaro.org>
> Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
> ---
> 
> I am resending the v2, since I realized that I forgot to update the version in
> the patch header.

This patch is in Linus' tree now.

> Changes in v2:
> 	Applied some Reviewed|Tested-by tags.
> 	Added some newlines. (Kevin)
> 	Checking for the sign instead of for a specific value. (Rafael)
> 
> 
> This patch is intended as a fix for 3.18-rc[n] due to the regression for Exynos
> SOCs.
> 
> I would also like to call for help in getting this thoroughly tested.
> 
> I have tested this on Arndale Dual, Exynos 5250 (see also the log attached in
> the commit message).
> 
> I have tested this on UX500, for which support for the generic PM domain is
> about to be queued for 3.19. Since UX500 uses the AMBA bus/drivers, which use
> pm_runtime_set_active() instead of pm_runtime_get_sync() during ->probe(), I
> could also verify that this behavior is maintained.
> 
> Finally, I have also tested this patch on UX500 together with the patchset
> below to make sure the approach is suitable in the long term as well.
> [PATCH v3 0/9] PM / Domains: Fix race conditions during boot
> http://www.spinics.net/lists/arm-kernel/msg369409.html
> 
> ---
>  drivers/base/power/domain.c | 38 ++++++++++++++++++++++++++++++++------
>  include/linux/pm_domain.h   |  2 +-
>  2 files changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index b520687..df41c69 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -361,9 +361,19 @@ static int __pm_genpd_save_device(struct pm_domain_data *pdd,
>  	struct device *dev = pdd->dev;
>  	int ret = 0;
>  
> -	if (gpd_data->need_restore)
> +	if (gpd_data->need_restore > 0)
>  		return 0;
>  
> +	/*
> +	 * If the value of the need_restore flag is still unknown at this point,
> +	 * we trust that pm_genpd_poweroff() has verified that the device is
> +	 * already runtime PM suspended.
> +	 */
> +	if (gpd_data->need_restore < 0) {
> +		gpd_data->need_restore = 1;
> +		return 0;
> +	}
> +
>  	mutex_unlock(&genpd->lock);
>  
>  	genpd_start_dev(genpd, dev);
> @@ -373,7 +383,7 @@ static int __pm_genpd_save_device(struct pm_domain_data *pdd,
>  	mutex_lock(&genpd->lock);
>  
>  	if (!ret)
> -		gpd_data->need_restore = true;
> +		gpd_data->need_restore = 1;
>  
>  	return ret;
>  }
> @@ -389,12 +399,17 @@ static void __pm_genpd_restore_device(struct pm_domain_data *pdd,
>  {
>  	struct generic_pm_domain_data *gpd_data = to_gpd_data(pdd);
>  	struct device *dev = pdd->dev;
> -	bool need_restore = gpd_data->need_restore;
> +	int need_restore = gpd_data->need_restore;
>  
> -	gpd_data->need_restore = false;
> +	gpd_data->need_restore = 0;
>  	mutex_unlock(&genpd->lock);
>  
>  	genpd_start_dev(genpd, dev);
> +
> +	/*
> +	 * Make sure to also restore those devices which we until now, didn't
> +	 * know the state for.
> +	 */
>  	if (need_restore)
>  		genpd_restore_dev(genpd, dev);
>  
> @@ -603,6 +618,7 @@ static void genpd_power_off_work_fn(struct work_struct *work)
>  static int pm_genpd_runtime_suspend(struct device *dev)
>  {
>  	struct generic_pm_domain *genpd;
> +	struct generic_pm_domain_data *gpd_data;
>  	bool (*stop_ok)(struct device *__dev);
>  	int ret;
>  
> @@ -628,6 +644,16 @@ static int pm_genpd_runtime_suspend(struct device *dev)
>  		return 0;
>  
>  	mutex_lock(&genpd->lock);
> +
> +	/*
> +	 * If we have an unknown state of the need_restore flag, it means none
> +	 * of the runtime PM callbacks has been invoked yet. Let's update the
> +	 * flag to reflect that the current state is active.
> +	 */
> +	gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
> +	if (gpd_data->need_restore < 0)
> +		gpd_data->need_restore = 0;
> +
>  	genpd->in_progress++;
>  	pm_genpd_poweroff(genpd);
>  	genpd->in_progress--;
> @@ -1442,7 +1468,7 @@ int __pm_genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
>  	mutex_lock(&gpd_data->lock);
>  	gpd_data->base.dev = dev;
>  	list_add_tail(&gpd_data->base.list_node, &genpd->dev_list);
> -	gpd_data->need_restore = genpd->status == GPD_STATE_POWER_OFF;
> +	gpd_data->need_restore = -1;
>  	gpd_data->td.constraint_changed = true;
>  	gpd_data->td.effective_constraint_ns = -1;
>  	mutex_unlock(&gpd_data->lock);
> @@ -1546,7 +1572,7 @@ void pm_genpd_dev_need_restore(struct device *dev, bool val)
>  
>  	psd = dev_to_psd(dev);
>  	if (psd && psd->domain_data)
> -		to_gpd_data(psd->domain_data)->need_restore = val;
> +		to_gpd_data(psd->domain_data)->need_restore = val ? 1 : 0;
>  
>  	spin_unlock_irqrestore(&dev->power.lock, flags);
>  }
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index b3ed776..2e0e06d 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -106,7 +106,7 @@ struct generic_pm_domain_data {
>  	struct notifier_block nb;
>  	struct mutex lock;
>  	unsigned int refcount;
> -	bool need_restore;
> +	int need_restore;
>  };
>  
>  #ifdef CONFIG_PM_GENERIC_DOMAINS
> 
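
For anyone following the thread, the tri-state handling of need_restore
in the patch above can be modelled in userspace roughly as follows. This
is only a sketch under stated assumptions: struct dev_model and the
model_*() helpers are made-up names for illustration, not kernel APIs,
and locking, the domain itself and the real save/restore callbacks are
left out. The counters simply record when a save or restore would
actually have run.

```c
#include <assert.h>

/* Hypothetical model of one device in a PM domain (not a kernel type). */
struct dev_model {
	int need_restore;	/* < 0: unknown, 0: active, > 0: state saved */
	int saves;		/* times the "save state" step really ran */
	int restores;		/* times the "restore state" step really ran */
};

/* Models __pm_genpd_add_device(): the initial state is unknown. */
static void model_add_device(struct dev_model *d)
{
	d->need_restore = -1;
	d->saves = 0;
	d->restores = 0;
}

/* Models __pm_genpd_save_device(). */
static void model_save(struct dev_model *d)
{
	if (d->need_restore > 0)
		return;			/* already saved */

	if (d->need_restore < 0) {	/* unknown: trust it is suspended */
		d->need_restore = 1;
		return;
	}

	d->saves++;			/* known active: really save */
	d->need_restore = 1;
}

/* Models __pm_genpd_restore_device(): unknown (-1) also restores. */
static void model_restore(struct dev_model *d)
{
	int need_restore = d->need_restore;

	d->need_restore = 0;
	if (need_restore)
		d->restores++;
}

/* Models the new hunk in pm_genpd_runtime_suspend() plus the power off. */
static void model_runtime_suspend(struct dev_model *d)
{
	if (d->need_restore < 0)
		d->need_restore = 0;	/* a callback ran: device was active */
	model_save(d);
}

/* Small self-check showing the two interesting paths. */
static void model_selfcheck(void)
{
	struct dev_model active, idle;

	model_add_device(&active);
	model_runtime_suspend(&active);	/* state is really saved */
	assert(active.saves == 1 && active.need_restore == 1);

	model_add_device(&idle);
	model_save(&idle);		/* powered off by a sibling device */
	assert(idle.saves == 0 && idle.need_restore == 1);
}
```

The point of the model is that a device whose runtime PM callbacks have
never run is not "saved" when a sibling powers the domain off, yet it is
still restored on the next power on, because -1 is nonzero.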

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


