[PATCH RFC 18/27] drivers: cpu-pd: Add PM Domain governor for CPUs
Lina Iyer
lina.iyer at linaro.org
Fri Nov 20 09:39:04 PST 2015
On Thu, Nov 19 2015 at 01:50 -0700, Marc Titinger wrote:
>On 18/11/2015 19:42, Lorenzo Pieralisi wrote:
>>On Tue, Nov 17, 2015 at 03:37:42PM -0700, Lina Iyer wrote:
>>>A PM domain comprising CPUs may be powered off when all the CPUs in
>>>the domain are powered down. Powering down a CPU domain is generally an
>>>expensive operation, and therefore the power/performance trade-offs
>>>should be considered. The time between the last CPU powering down and
>>>the first CPU powering up in a domain is the time available for the
>>>domain to sleep. Ideally, the sleep time of the domain should fulfill
>>>the residency requirement of the domain's idle state.
>>>
>>>To do this effectively, read the time until the next wakeup of any of
>>>the cluster's CPUs and ensure that the domain's idle state sleep time
>>>satisfies both the QoS requirement of each of the CPUs (the PM QoS
>>>CPU_DMA_LATENCY constraint) and the state's residency.
>>
>>To me this information should be part of the CPUidle governor (it is
>>already there); we should not split the decision into multiple layers.
>>
>>The problem you are facing is that the CPUidle governor(s) do not take
>>cross-CPU relationships into account. I do not think that adding another
>>decision layer in the power domain subsystem helps; you are doing that
>>just because adding it to the existing CPUidle governor(s) is invasive.
>>
>>Why can't we use the power domain work you put together to, e.g., disable
>>idle states that are shared between multiple cpus and make them "visible"
>>only when the power domain that encompasses them is actually going down?
>>
>>You could use the power domains information to detect states that
>>are shared between cpus.
>>
>>It is just an idea. What I am saying is that having another governor in
>>the power domain subsystem does not make much sense: you split the
>>decision into two layers when there is really only one place for it, the
>>existing CPUidle governor, and that's where the decision should be taken.
>>
>>Thoughts appreciated.
>
>Maybe this is silly and not thought through, but I wonder if the
>responsibilities could be split, for instance with an outer control loop
>that has the heuristic to compute the next tick time and the required
>cpu-power during that time slot, and an inner control loop (genpd)
>that has a per-domain QoS and can optimize power consumption.
>
Not sure I understand everything you said, but heuristics across a
bunch of CPUs can be very erratic. It's hard enough for the menu
governor to determine heuristics on a per-CPU basis.
The governor in this patch already takes care of PM QoS, but it does
not do per-CPU QoS; a simplified sketch of the selection rule it
applies is below.
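To make the trade-off concrete, the rule boils down to something like
the standalone sketch below (the struct, the function names and the
numbers are made-up stand-ins for this example, not the genpd API):

#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_USEC 1000LL

/* Illustrative stand-in for the genpd idle state table */
struct pd_state {
	int64_t power_off_latency_ns;
	int64_t power_on_latency_ns;
	int64_t residency_ns;
};

/*
 * Pick the deepest state whose off + on latency plus residency fits
 * in the available sleep window and stays within the QoS bound.
 * Index 0, the shallowest state, is the fallback.
 */
static int pick_state(const struct pd_state *states, int count,
		      int64_t sleep_ns, int64_t qos_us)
{
	int i;

	for (i = count - 1; i > 0; i--) {
		int64_t cost = states[i].power_off_latency_ns +
			       states[i].power_on_latency_ns +
			       states[i].residency_ns;

		if (cost > sleep_ns)	/* won't save power; go shallower */
			continue;

		if (cost < qos_us * NSEC_PER_USEC)	/* honors QoS; take it */
			break;
	}

	return i;
}

int main(void)
{
	/* Made-up latencies: clock gate, retention, full power-off */
	const struct pd_state states[] = {
		{ 0, 0, 0 },
		{ 100000, 100000, 500000 },
		{ 500000, 500000, 2000000 },
	};

	/* 5 ms sleep window, 10000 us QoS: the deepest state qualifies */
	printf("chosen state: %d\n", pick_state(states, 3, 5000000, 10000));
	return 0;
}

Per-CPU QoS would presumably mean replacing the single global qos_us
here with the minimum of each online CPU's own constraint.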
We should discuss this more.
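On Lorenzo's suggestion to instead hide the shared states until the
domain can actually go down: the bookkeeping might look roughly like
the toy model below. This is purely illustrative userspace code with
made-up names, not kernel code; a real version would hook into the
cpuidle enter/exit path.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy per-domain bookkeeping: how many CPUs are still running */
struct cpu_domain {
	atomic_int awake_cpus;
};

/*
 * A cluster-wide idle state would be "visible" to a CPU only when
 * it is the last CPU still awake in its domain.
 */
static bool cluster_state_visible(struct cpu_domain *d)
{
	return atomic_load(&d->awake_cpus) == 1;
}

static void cpu_enter_idle(struct cpu_domain *d)
{
	atomic_fetch_sub(&d->awake_cpus, 1);
}

static void cpu_exit_idle(struct cpu_domain *d)
{
	atomic_fetch_add(&d->awake_cpus, 1);
}

int main(void)
{
	struct cpu_domain d = { 2 };	/* two CPUs, both awake */

	printf("visible with 2 awake: %d\n", cluster_state_visible(&d));
	cpu_enter_idle(&d);		/* the other CPU goes idle */
	printf("visible with 1 awake: %d\n", cluster_state_visible(&d));
	cpu_exit_idle(&d);
	return 0;
}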
-- Lina
>Marc.
>
>>
>>Lorenzo
>>
>>>Signed-off-by: Lina Iyer <lina.iyer at linaro.org>
>>>---
>>> drivers/base/power/cpu-pd.c | 83 ++++++++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 82 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/drivers/base/power/cpu-pd.c b/drivers/base/power/cpu-pd.c
>>>index 617ce54..a00abc1 100644
>>>--- a/drivers/base/power/cpu-pd.c
>>>+++ b/drivers/base/power/cpu-pd.c
>>>@@ -21,6 +21,7 @@
>>> #include <linux/pm_qos.h>
>>> #include <linux/rculist.h>
>>> #include <linux/slab.h>
>>>+#include <linux/tick.h>
>>>
>>> #define CPU_PD_NAME_MAX 36
>>>
>>>@@ -66,6 +67,86 @@ static void get_cpus_in_domain(struct generic_pm_domain *genpd,
>>> }
>>> }
>>>
>>>+static bool cpu_pd_down_ok(struct dev_pm_domain *pd)
>>>+{
>>>+ struct generic_pm_domain *genpd = pd_to_genpd(pd);
>>>+ struct cpu_pm_domain *cpu_pd = to_cpu_pd(genpd);
>>>+ int qos = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
>>>+ s64 sleep_ns;
>>>+ ktime_t earliest;
>>>+ int cpu;
>>>+ int i;
>>>+
>>>+ /* Reset the last set genpd state, default to index 0 */
>>>+ genpd->state_idx = 0;
>>>+
>>>+ /* We don't want to power down if the QoS requirement is 0 */
>>>+ if (!qos)
>>>+ return false;
>>>+
>>>+ /*
>>>+ * Find the sleep time for the cluster.
>>>+ * The time between now and the first wakeup of any CPU in
>>>+ * this domain hierarchy is the time available for the
>>>+ * domain to be idle.
>>>+ */
>>>+ earliest.tv64 = KTIME_MAX;
>>>+ for_each_cpu_and(cpu, cpu_pd->cpus, cpu_online_mask) {
>>>+ struct device *cpu_dev = get_cpu_device(cpu);
>>>+ struct gpd_timing_data *td;
>>>+
>>>+ td = &dev_gpd_data(cpu_dev)->td;
>>>+
>>>+ if (earliest.tv64 > td->next_wakeup.tv64)
>>>+ earliest = td->next_wakeup;
>>>+ }
>>>+
>>>+ sleep_ns = ktime_to_ns(ktime_sub(earliest, ktime_get()));
>>>+ if (sleep_ns <= 0)
>>>+ return false;
>>>+
>>>+ /*
>>>+ * Find the deepest sleep state that satisfies the residency
>>>+ * requirement and the QoS constraint
>>>+ */
>>>+ for (i = genpd->state_count - 1; i > 0; i--) {
>>>+ u64 state_sleep_ns;
>>>+
>>>+ state_sleep_ns = genpd->states[i].power_off_latency_ns +
>>>+ genpd->states[i].power_on_latency_ns +
>>>+ genpd->states[i].residency_ns;
>>>+
>>>+ /*
>>>+ * If we can't save power by sleeping in this state, move
>>>+ * on to the next shallower idle state.
>>>+ */
>>>+ if (state_sleep_ns > sleep_ns)
>>>+ continue;
>>>+
>>>+ /*
>>>+ * We also don't want to sleep longer than we should
>>>+ * if we are to guarantee QoS.
>>>+ */
>>>+ if (state_sleep_ns < (qos * NSEC_PER_USEC))
>>>+ break;
>>>+ }
>>>+
>>>+ /* i is the chosen state; index 0 is the fallback */
>>>+ genpd->state_idx = i;
>>>+
>>>+ return true;
>>>+}
>>>+
>>>+static bool cpu_stop_ok(struct device *dev)
>>>+{
>>>+ return true; /* no per-CPU constraint; QoS is checked at the domain level */
>>>+}
>>>+
>>>+static struct dev_power_governor cpu_pd_gov = {
>>>+ .power_down_ok = cpu_pd_down_ok,
>>>+ .stop_ok = cpu_stop_ok,
>>>+};
>>>+
>>> static int cpu_pd_power_off(struct generic_pm_domain *genpd)
>>> {
>>> struct cpu_pm_domain *pd = to_cpu_pd(genpd);
>>>@@ -183,7 +264,7 @@ int of_register_cpu_pm_domain(struct device_node *dn,
>>>
>>> /* Register the CPU genpd */
>>> pr_debug("adding %s as CPU PM domain.\n", pd->genpd->name);
>>>- ret = of_pm_genpd_init(dn, pd->genpd, &simple_qos_governor, false);
>>>+ ret = of_pm_genpd_init(dn, pd->genpd, &cpu_pd_gov, false);
>>> if (ret) {
>>> pr_err("Unable to initialize domain %s\n", dn->full_name);
>>> return ret;
>>>--
>>>2.1.4
>>>
>