[PATCH v3 0/5] Rework system pressure interface to the scheduler

Vincent Guittot vincent.guittot at linaro.org
Tue Jan 9 05:29:31 PST 2024


On Tue, 9 Jan 2024 at 12:34, Dietmar Eggemann <dietmar.eggemann at arm.com> wrote:
>
> On 08/01/2024 14:48, Vincent Guittot wrote:
> > Following the consolidation and cleanup of CPU capacity in [1], this series
> > reworks how the scheduler gets the pressures on CPUs. We need to take into
> > account all pressures applied by cpufreq on the compute capacity of a CPU
> > for dozens of ms or more, and not only the cpufreq cooling device or HW
> > mitigations. We split the pressure applied on a CPU's capacity in 2 parts:
> > - one from cpufreq and freq_qos
> > - one from HW high freq mitigation.
> >
> > The next step will be to add a dedicated interface for long standing
> > capping of the CPU capacity (i.e. for seconds or more) like the
> > scaling_max_freq of cpufreq sysfs. The latter is already taken into
> > account by this series but as a temporary pressure which is not always the
> > best choice when we know that it will happen for seconds or more.
>
> I guess this is related to the 'user space system pressure' (*) slide of
> your OSPM '23 talk.

Yes.

>
> Where do you draw the line when it comes to time between (*) and the
> 'medium pace system pressure' (e.g. thermal and FREQ_QOS)?

My goal is to consider the /sys/../scaling_max_freq as the 'user space
system pressure'.

>
> IIRC, with (*) you want to rebuild the sched domains etc.

The easiest way would be to rebuild the sched_domain, but the cost is
not small, so I would prefer to skip the rebuild and add a new signal
that keeps track of this capped capacity.
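
As a rough illustration only (this is not code from the series), here is
a user-space sketch of how a cap on the max frequency could be turned
into a capped capacity and a pressure value. The function names are
hypothetical, and the linear scaling of capacity with frequency is an
assumption made for the example; capacity is on the scheduler's usual
0..1024 scale.

```c
#include <stdint.h>

/*
 * User-space sketch, not kernel code. Both helpers are hypothetical
 * and assume that capacity scales linearly with frequency.
 */

/* Capacity still reachable when the max frequency is capped. */
static uint64_t capped_capacity(uint64_t max_capacity,
				uint64_t capped_freq,
				uint64_t max_freq)
{
	/* No cap in effect (or no valid reference frequency). */
	if (max_freq == 0 || capped_freq >= max_freq)
		return max_capacity;

	return max_capacity * capped_freq / max_freq;
}

/* Capacity lost because of the cap, i.e. the pressure. */
static uint64_t freq_cap_pressure(uint64_t max_capacity,
				  uint64_t capped_freq,
				  uint64_t max_freq)
{
	return max_capacity -
	       capped_capacity(max_capacity, capped_freq, max_freq);
}
```

With a CPU of capacity 1024 and a max frequency of 2000000 kHz, capping
scaling_max_freq to 1500000 kHz leaves a capacity of 768 in this model,
so the pressure is 256.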

>
> >
> > [1] https://lore.kernel.org/lkml/20231211104855.558096-1-vincent.guittot@linaro.org/
> >
> > Change since v2:
> > - Rework cpufreq_update_pressure()
> >
> > Change since v1:
> > - Use struct cpufreq_policy as parameter of cpufreq_update_pressure()
> > - Fix typos and comments
> > - Mark sched_thermal_decay_shift boot param as deprecated
> >
> > Vincent Guittot (5):
> >   cpufreq: Add a cpufreq pressure feedback for the scheduler
> >   sched: Take cpufreq feedback into account
> >   thermal/cpufreq: Remove arch_update_thermal_pressure()
> >   sched: Rename arch_update_thermal_pressure into
> >     arch_update_hw_pressure
> >   sched/pelt: Remove shift of thermal clock
> >
> >  .../admin-guide/kernel-parameters.txt         |  1 +
> >  arch/arm/include/asm/topology.h               |  6 +-
> >  arch/arm64/include/asm/topology.h             |  6 +-
> >  drivers/base/arch_topology.c                  | 26 ++++----
> >  drivers/cpufreq/cpufreq.c                     | 36 +++++++++++
> >  drivers/cpufreq/qcom-cpufreq-hw.c             |  4 +-
> >  drivers/thermal/cpufreq_cooling.c             |  3 -
> >  include/linux/arch_topology.h                 |  8 +--
> >  include/linux/cpufreq.h                       | 10 +++
> >  include/linux/sched/topology.h                |  8 +--
> >  .../{thermal_pressure.h => hw_pressure.h}     | 14 ++---
> >  include/trace/events/sched.h                  |  2 +-
> >  init/Kconfig                                  | 12 ++--
> >  kernel/sched/core.c                           |  8 +--
> >  kernel/sched/fair.c                           | 63 +++++++++----------
> >  kernel/sched/pelt.c                           | 18 +++---
> >  kernel/sched/pelt.h                           | 16 ++---
> >  kernel/sched/sched.h                          | 22 +------
> >  18 files changed, 144 insertions(+), 119 deletions(-)
> >  rename include/trace/events/{thermal_pressure.h => hw_pressure.h} (55%)
>


