[PATCH v3 2/2] arm64: use WFE for long delays

Will Deacon will.deacon at arm.com
Thu Oct 12 01:52:13 PDT 2017


On Thu, Oct 12, 2017 at 09:47:26AM +0100, Julien Thierry wrote:
> Hi Will,
> 
> On 11/10/17 16:13, Will Deacon wrote:
> >Hi Julien,
> >
> >On Fri, Sep 29, 2017 at 11:52:30AM +0100, Julien Thierry wrote:
> >>The current delay implementation uses the yield instruction, which is a
> >>hint that it is beneficial to schedule another thread. As this is a hint,
> >>it may be implemented as a NOP, causing all delays to be busy loops. This
> >>is the case for many existing CPUs.
> >>
> >>Taking advantage of the generic timer sending periodic events to all
> >>cores, we can use WFE during delays to reduce power consumption. This is
> >>beneficial only for delays longer than the period of the timer event
> >>stream.
> >>
> >>If the timer event stream is not enabled, delays will behave as yield/busy
> >>loops.
> >>
> >>Signed-off-by: Julien Thierry <julien.thierry at arm.com>
> >>Cc: Catalin Marinas <catalin.marinas at arm.com>
> >>Cc: Will Deacon <will.deacon at arm.com>
> >>Cc: Mark Rutland <mark.rutland at arm.com>
> >>---
> >>  arch/arm64/lib/delay.c               | 23 +++++++++++++++++++----
> >>  include/clocksource/arm_arch_timer.h |  4 +++-
> >>  2 files changed, 22 insertions(+), 5 deletions(-)
> >>
> >>diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
> >>index dad4ec9..4dc27f3 100644
> >>--- a/arch/arm64/lib/delay.c
> >>+++ b/arch/arm64/lib/delay.c
> >>@@ -24,10 +24,28 @@
> >>  #include <linux/module.h>
> >>  #include <linux/timex.h>
> >>
> >>+#include <clocksource/arm_arch_timer.h>
> >>+
> >>+#define USECS_TO_CYCLES(TIME_USECS)			\
> >>+	xloops_to_cycles((TIME_USECS) * 0x10C7UL)
> >
> >The macro parameter can be lower-case here.
> >
> 
> Noted, I'll change it.
> 
> >>+static inline unsigned long xloops_to_cycles(unsigned long xloops)
> >>+{
> >>+	return (xloops * loops_per_jiffy * HZ) >> 32;
> >>+}
> >>+
> >>  void __delay(unsigned long cycles)
> >>  {
> >>  	cycles_t start = get_cycles();
> >>
> >>+	if (arch_timer_evtstrm_available()) {
> >
> >Hmm, is this never called in a context where preemption is enabled?
> >Maybe arch_timer_evtstrm_available should be using raw_smp_processor_id()
> >under the hood.
> >
> 
> This can be called from a preemptible context. But when it is, the event
> stream is either enabled both on the CPU running the preemptible context and
> on any CPU where the preempted context can be resumed, or it is disabled
> across the whole system.
> 
> Does using raw_smp_processor_id solve an issue here?

I thought that DEBUG_PREEMPT would splat if you called smp_processor_id()
from preemptible context?

Will
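
For reference, the arithmetic behind the quoted USECS_TO_CYCLES() macro can be
checked in user space. The constant 0x10C7 is 4295, i.e. 2^32 / 10^6 rounded
up, so multiplying by it and shifting right by 32 divides by one million. The
sketch below is only an illustration: LPJ_ASSUMED and HZ_ASSUMED are
hypothetical stand-ins for the kernel's loops_per_jiffy and HZ (chosen to model
a 50 MHz counter), and the overflow assert is an addition that is not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, NOT from the patch: 200000 loops per jiffy at
 * 250 jiffies per second models a counter ticking at 50 MHz. */
#define HZ_ASSUMED   250ULL
#define LPJ_ASSUMED  200000ULL

/* Same shape as the quoted helper: scale by ticks per second, then drop
 * the 32 fractional bits carried by xloops. */
static uint64_t xloops_to_cycles(uint64_t xloops)
{
	/* Guard added for this sketch only: the product must fit in 64 bits. */
	assert(xloops <= UINT64_MAX / (LPJ_ASSUMED * HZ_ASSUMED));
	return (xloops * LPJ_ASSUMED * HZ_ASSUMED) >> 32;
}

/* 0x10C7 == 4295 == ceil(2^32 / 1000000), as in the quoted macro. */
static uint64_t usecs_to_cycles(uint64_t usecs)
{
	return xloops_to_cycles(usecs * 0x10C7ULL);
}
```

Under these assumed values, usecs_to_cycles(100) evaluates to 5000, which
matches 100us of a 50 MHz counter. Because the constant is rounded up, results
can come out slightly high for large inputs, which errs on the side of a
longer delay.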


