[PATCH 8/9] arm64: support cpuidle-haltpoll

Ankur Arora ankur.a.arora at oracle.com
Tue Jun 4 16:09:06 PDT 2024


Okanovic, Haris <harisokn at amazon.com> writes:

> On Tue, 2024-04-30 at 11:37 -0700, Ankur Arora wrote:
>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>>
>>
>>
>> Add architectural support for the cpuidle-haltpoll driver by defining
>> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
>> an optimized polling mechanism via smp_cond_load*().
>>
>> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
>> cpuidle-haltpoll to be selected.
>>
>> Note that we limit cpuidle-haltpoll support to when the event-stream is
>> available. This is necessary because polling via smp_cond_load_relaxed()
>> uses WFE to wait for a store which might not happen for an prolonged
>> period of time. So, ensure the event-stream is around to provide a
>> terminating condition.
>>
>> Signed-off-by: Ankur Arora <ankur.a.arora at oracle.com>
>> ---
>>  arch/arm64/Kconfig                        | 10 ++++++++++
>>  arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
>>  2 files changed, 31 insertions(+)
>>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 7b11c98b3e84..6f2df162b10e 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -34,6 +34,7 @@ config ARM64
>>         select ARCH_HAS_MEMBARRIER_SYNC_CORE
>>         select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
>>         select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> +       select ARCH_HAS_OPTIMIZED_POLL
>>         select ARCH_HAS_PTE_DEVMAP
>>         select ARCH_HAS_PTE_SPECIAL
>>         select ARCH_HAS_HW_PTE_YOUNG
>> @@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
>>  config ARCH_SUSPEND_POSSIBLE
>>         def_bool y
>>
>> +config ARCH_CPUIDLE_HALTPOLL
>> +       bool "Enable selection of the cpuidle-haltpoll driver"
>> +       default n
>> +       help
>> +         cpuidle-haltpoll allows for adaptive polling based on
>> +         current load before entering the idle state.
>> +
>> +         Some virtualized workloads benefit from using it.
>> +
>>  endmenu # "Power management options"
>>
>>  menu "CPU Power Management"
>> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> new file mode 100644
>> index 000000000000..a79bdec7f516
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> @@ -0,0 +1,21 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_HALTPOLL_H
>> +#define _ASM_HALTPOLL_H
>> +
>> +static inline void arch_haltpoll_enable(unsigned int cpu)
>> +{
>> +}
>> +
>> +static inline void arch_haltpoll_disable(unsigned int cpu)
>> +{
>> +}
>> +
>> +static inline bool arch_haltpoll_supported(void)
>> +{
>> +       /*
>> +        * Ensure the event stream is available to provide a terminating
>> +        * condition to the WFE in the poll loop.
>> +        */
>> +       return arch_timer_evtstrm_available();
>
> Note this fails build when CONFIG_HALTPOLL_CPUIDLE=m (module):
>
> ERROR: modpost: "arch_cpu_idle" [drivers/cpuidle/cpuidle-haltpoll.ko]
> undefined!
> ERROR: modpost: "arch_timer_evtstrm_available"
> [drivers/cpuidle/cpuidle-haltpoll.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
> make[1]: *** [/home/ubuntu/linux/Makefile:1886: modpost] Error 2
> make: *** [Makefile:240: __sub-make] Error 2

Thanks for trying it out. Missed that.

> You could add EXPORT_SYMBOL_*()'s on the above helpers or restrict
> HALTPOLL_CPUIDLE module to built-in (remove "tristate" Kconfig).

Yeah AFAICT this is the only cpuidle driver providing the module
option. Unfortunately can't remove the tristate thing. People might
already be using it as a module on x86.

I think the arch_cpu_idle() makes sense to export. For
arch_timer_evtstrm_available(), eventually the arch_haltpoll_*()
in any case need to move out of a header file. I'll just do that
now.

> Otherwise, everything worked for me when built-in (=y) atop 6.10.0
> (4a4be1a). I see similar performance gains in `perf bench` on AWS
> Graviton3 c7g.16xlarge.

Excellent. Thanks for checking.

--
ankur



More information about the linux-arm-kernel mailing list