[PATCH RFC 18/27] drivers: cpu-pd: Add PM Domain governor for CPUs
Marc Titinger
mtitinger at baylibre.com
Thu Nov 19 00:50:30 PST 2015
On 18/11/2015 19:42, Lorenzo Pieralisi wrote:
> On Tue, Nov 17, 2015 at 03:37:42PM -0700, Lina Iyer wrote:
>> A PM domain comprising CPUs may be powered off when all the CPUs in
>> the domain are powered down. Powering down a CPU domain is generally an
>> expensive operation, and therefore the power/performance trade-offs
>> should be considered. The time between the last CPU powering down and
>> the first CPU powering up in a domain is the time available for the
>> domain to sleep. Ideally, the sleep time of the domain should fulfill
>> the residency requirement of the domain's idle state.
>>
>> To do this effectively, read the time until the next wakeup of the
>> cluster's CPUs and ensure that the domain's idle state sleep time
>> satisfies the QoS requirement of each CPU (the PM QoS CPU_DMA_LATENCY)
>> as well as the state's residency.
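
(Just to put numbers on the trade-off described above, invented purely
for illustration: if a domain state costs 800us to power off and 700us
to power on, with a 2500us residency, it is only worth entering when the
first CPU in the domain is predicted to wake up more than
800 + 700 + 2500 = 4000us from now, and only if the CPU_DMA_LATENCY
request of every CPU in the domain can tolerate the 1500us of entry plus
exit latency.)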
>
> To me this information should be part of the CPUidle governor (it is
> already there); we should not split the decision into multiple layers.
>
> The problem you are facing is that the CPUidle governor(s) do not take
> cross-CPU relationships into account. I do not think that adding another
> decision layer in the power domain subsystem helps; you are doing that
> just because adding it to the existing CPUidle governor(s) is invasive.
>
> Why can't we use the power domain work you put together to, e.g., disable
> idle states that are shared between multiple CPUs and make them "visible"
> only when the power domain that encompasses them is actually going down?
>
> You could use the power domain information to detect states that
> are shared between CPUs.
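
If I read this right, it could look roughly like the sketch below, where
cluster-wide states are simply filtered out of the per-CPU selection
unless the domain can actually go down. Both helpers are hypothetical,
named here only to illustrate the idea:

#include <linux/cpuidle.h>
#include <linux/smp.h>

/* Both helpers are hypothetical, named only to illustrate the idea. */
extern bool state_spans_cluster(struct cpuidle_driver *drv, int idx);
extern bool cpu_pd_may_power_off(int cpu);

/*
 * Sketch: filter the states seen by the existing CPUidle governor so
 * that a cluster-wide state is only a candidate when this CPU's PM
 * domain could actually power off.
 */
static bool state_usable(struct cpuidle_driver *drv, int idx)
{
	/* Per-CPU states are always candidates. */
	if (!state_spans_cluster(drv, idx))
		return true;

	return cpu_pd_may_power_off(smp_processor_id());
}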
>
> It is just an idea; what I am saying is that having another governor in
> the power domain subsystem does not make much sense. You split the
> decision across two layers while there is actually only one, the existing
> CPUidle governor, and that is where the decision should be taken.
>
> Thoughts appreciated.
Maybe this is silly and not thought through, but I wonder if the
responsibilities could be split, for instance with an outer control loop
that has the heuristic to compute the next tick time and the required
CPU power needed during that time slot, and an inner control loop
(genpd) that has a per-domain QoS and can optimize power consumption.
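
Very roughly, and only as a sketch: the struct, the two outer-loop hooks
and the function names below are all made up for illustration; only the
genpd state fields (state_count, power_off/on_latency_ns, residency_ns)
come from Lina's series.

#include <linux/ktime.h>
#include <linux/pm_domain.h>

/* Hypothetical per-domain budget, filled in by the outer loop. */
struct domain_budget {
	ktime_t next_event;	/* predicted next wakeup in the domain  */
	s64 qos_ns;		/* aggregated per-domain latency budget */
};

/* Hypothetical outer-loop hooks, shown only to illustrate the split. */
extern ktime_t predict_domain_next_wakeup(struct generic_pm_domain *genpd);
extern s64 aggregate_domain_qos_ns(struct generic_pm_domain *genpd);

/* Outer loop: heuristic side, refreshes the budget from tick/QoS data. */
static void outer_update(struct generic_pm_domain *genpd,
			 struct domain_budget *b)
{
	b->next_event = predict_domain_next_wakeup(genpd);
	b->qos_ns = aggregate_domain_qos_ns(genpd);
}

/* Inner loop: genpd picks the deepest state that fits the budget. */
static int inner_pick_state(struct generic_pm_domain *genpd,
			    struct domain_budget *b)
{
	s64 window = ktime_to_ns(ktime_sub(b->next_event, ktime_get()));
	int i;

	for (i = genpd->state_count - 1; i >= 0; i--) {
		s64 latency = genpd->states[i].power_off_latency_ns +
			      genpd->states[i].power_on_latency_ns;

		if (latency + genpd->states[i].residency_ns <= window &&
		    latency <= b->qos_ns)
			return i;
	}

	return 0;	/* fall back to the shallowest state */
}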
Marc.
>
> Lorenzo
>
>> Signed-off-by: Lina Iyer <lina.iyer at linaro.org>
>> ---
>> drivers/base/power/cpu-pd.c | 83 ++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 82 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/power/cpu-pd.c b/drivers/base/power/cpu-pd.c
>> index 617ce54..a00abc1 100644
>> --- a/drivers/base/power/cpu-pd.c
>> +++ b/drivers/base/power/cpu-pd.c
>> @@ -21,6 +21,7 @@
>> #include <linux/pm_qos.h>
>> #include <linux/rculist.h>
>> #include <linux/slab.h>
>> +#include <linux/tick.h>
>>
>> #define CPU_PD_NAME_MAX 36
>>
>> @@ -66,6 +67,86 @@ static void get_cpus_in_domain(struct generic_pm_domain *genpd,
>> }
>> }
>>
>> +static bool cpu_pd_down_ok(struct dev_pm_domain *pd)
>> +{
>> + struct generic_pm_domain *genpd = pd_to_genpd(pd);
>> + struct cpu_pm_domain *cpu_pd = to_cpu_pd(genpd);
>> + int qos = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
>> + s64 sleep_ns;
>> + ktime_t earliest;
>> + int cpu;
>> + int i;
>> +
>> + /* Reset the last set genpd state, default to index 0 */
>> + genpd->state_idx = 0;
>> +
>> + /* We don't want to power down if QoS is 0 */
>> + if (!qos)
>> + return false;
>> +
>> + /*
>> + * Find the sleep time for the cluster.
>> + * The time between now and the first wakeup of any CPU in
>> + * this domain hierarchy is the time available for the
>> + * domain to be idle.
>> + */
>> + earliest.tv64 = KTIME_MAX;
>> + for_each_cpu_and(cpu, cpu_pd->cpus, cpu_online_mask) {
>> + struct device *cpu_dev = get_cpu_device(cpu);
>> + struct gpd_timing_data *td;
>> +
>> + td = &dev_gpd_data(cpu_dev)->td;
>> +
>> + if (td->next_wakeup.tv64 < earliest.tv64)
>> + earliest = td->next_wakeup;
>> + }
>> +
>> + sleep_ns = ktime_to_ns(ktime_sub(earliest, ktime_get()));
>> + if (sleep_ns <= 0)
>> + return false;
>> +
>> + /*
>> + * Find the deepest sleep state that satisfies the residency
>> + * requirement and the QoS constraint
>> + */
>> + for (i = genpd->state_count - 1; i > 0; i--) {
>> + u64 state_sleep_ns;
>> +
>> + state_sleep_ns = genpd->states[i].power_off_latency_ns +
>> + genpd->states[i].power_on_latency_ns +
>> + genpd->states[i].residency_ns;
>> +
>> + /*
>> + * If we can't sleep to save power in the state, move on
>> + * to the next lower idle state.
>> + */
>> + if (state_sleep_ns > sleep_ns)
>> + continue;
>> +
>> + /*
>> + * We also don't want to sleep more than we should to
>> + * guarantee QoS.
>> + */
>> + if (state_sleep_ns < (qos * NSEC_PER_USEC))
>> + break;
>> + }
>> +
>> + if (i >= 0)
>> + genpd->state_idx = i;
>> +
>> + return (i >= 0) ? true : false;
>> +}
>> +
>> +static bool cpu_stop_ok(struct device *dev)
>> +{
>> + return true;
>> +}
>> +
>> +struct dev_power_governor cpu_pd_gov = {
>> + .power_down_ok = cpu_pd_down_ok,
>> + .stop_ok = cpu_stop_ok,
>> +};
>> +
>> static int cpu_pd_power_off(struct generic_pm_domain *genpd)
>> {
>> struct cpu_pm_domain *pd = to_cpu_pd(genpd);
>> @@ -183,7 +264,7 @@ int of_register_cpu_pm_domain(struct device_node *dn,
>>
>> /* Register the CPU genpd */
>> pr_debug("adding %s as CPU PM domain.\n", pd->genpd->name);
>> - ret = of_pm_genpd_init(dn, pd->genpd, &simple_qos_governor, false);
>> + ret = of_pm_genpd_init(dn, pd->genpd, &cpu_pd_gov, false);
>> if (ret) {
>> pr_err("Unable to initialize domain %s\n", dn->full_name);
>> return ret;
>> --
>> 2.1.4
>>