[PATCH 0/3] arm64: Fix cpuidle with pseudo-NMI enabled

Marc Zyngier maz at kernel.org
Fri Jun 11 04:32:32 PDT 2021


On Fri, 11 Jun 2021 10:41:34 +0100,
Lorenzo Pieralisi <lorenzo.pieralisi at arm.com> wrote:
> 
> On Fri, Jun 11, 2021 at 09:19:22AM +0100, Marc Zyngier wrote:
> > Hi Lorenzo,
> > 
> > On Thu, 10 Jun 2021 17:28:23 +0100,
> > Lorenzo Pieralisi <lorenzo.pieralisi at arm.com> wrote:
> > > 
> > > On Tue, Jun 08, 2021 at 06:27:12PM +0100, Marc Zyngier wrote:
> > > > It appears that although cpu_do_idle() is correctly dealing with the
> > > > PMR/DAIF duality, the PSCI cpu-suspend code has been left unaware of
> > > > it.
> > > > 
> > > > On a system that uses PSCI for idle (such as the Ampere Altra I have
> > > > access to), the kernel dies as soon as it enters idle (interrupts are
> > > > off at the GIC CPU interface level). Boo.
> > > 
> > > After investigating a bit I realised that this should depend on
> > > ICC_CTLR_EL3.PMHE - if that's clear the PMR should not affect the
> > > GICR->CPU IRQ forwarding (or WakeRequest signal generation when the
> > > GICR_WAKER.ProcessorSleep==1).
> > 
> > You lost me here. I don't see what PMHE has to do here. It is solely
> > used for 1:N distribution, and is the only way PMR does affect the
> > propagation of interrupts to the CPU interface. Fortunately, nobody
> > uses 1:N.
> > 
> > > IIUC if PMHE == 0, the PMR plays no role in wfi completion (and
> > > WakeSignal generation for a CPU/GICR in quiescent state).
> > 
> > Of course it does. PMR gates interrupts *before* they are signalled to
> > the CPU, meaning that if you keep interrupts masked at the PMR level,
> > you will never wake up from WFI. Or am I missing your point entirely?
> 
> For "simple" wfi (as in executing the wfi instruction) yes. The
> IRQs are forwarded to the CPU interface that filters the IRQs based
> on priorities and signals the I/F "pin" so that the core wakes up
> and wfi completes - forgive me my misunderstanding.

No worries. We're talking about the GIC architecture here, which has
the potential to confuse anyone! :D

> For deep sleep states where GICR_WAKER.ProcessorSleep == 1, the
> WakeSignal (ie CPU reset) is generated independently of the PMR
> value AFAIK. This means that even *if* an IRQ is supposed to be
> masked by the PMR it would wake up a sleeping core _anyway_.

I'm not sure we can draw this conclusion. It certainly isn't the
behaviour I'm observing. Otherwise, my system would be able to wake up
without any additional hacks. It may wake up the CPU interface, but
not the whole core. It is also entirely possible that firmware will
use the PMR value as a hint to decide which interrupts can wake up the
CPU, if it is so inclined.

> This behaviour is different from shallow C-states (and simple wfi).
> 
> That's why I asked what path is causing trouble in
> psci_cpu_suspend_enter().

It is the one where we don't lose context. Which makes sense, as it
would otherwise behave like a full reset, and we'd end up with some
sane values.

> > > I assume on Ampere Altra PMHE == 1.
> > 
> > No, it is 0, as indicated by:
> > 
> > <quote>
> > [    0.000000] GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation
> > </quote>
> > 
> > > This changes almost nothing to the need for this patchset but
> > > at least we clarify this behaviour.
> > > 
> > > Also, we should not be writing ICC_PMR_EL1 when
> > > GICR_WAKER.ProcessorSleep == 1 (which may be set in
> > > gic_cpu_pm_notifier()), this can hang the system.
> > 
> > Why? PMR defines what interrupts will be presented to the CPU
> > interface and trigger an exception. It doesn't affect putting the CPU
> > to sleep nor the wake-up.
> 
> I don't think we are allowed to have traffic between the CPU IF and
> the GICR when ProcessorSleep == 1. So, again IIUC, we can't write
> the PMR (if PMHE == 1) after putting the GICR in ProcessorSleep == 1
> because this would sync with the GICR.

Hmmm. This restriction isn't obvious to me. There are some vague
threats in 10.1, but nothing concrete. A.4.11 is more explicit, but
doesn't really define what 'traffic' is. It is also tied to the stream
protocol, which isn't actually mandated.

But if it exists, PMHE doesn't have any influence on PMR being sent
back to the RD. That can happen at any time (it is just that we
enforce it with a DSB in the case where PMHE is set). Note that
cpu_do_idle() already does this, so we'd already be on thin ice. It is
also unclear whether 'traffic' occurs when the group enables are set
to 0, which is the case when we set ProcessorSleep=1.

And to be honest, setting ProcessorSleep=1 in the kernel (which
implies DS=1) has always been pretty odd. I don't know of any HW that
requires it. I'd be tempted to get rid of it for good.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


