[PATCH] arm: vexpress: tc2: fix CPU hotplug and CPU idle race on cluster power down

Lorenzo Pieralisi lorenzo.pieralisi at arm.com
Fri Sep 27 11:27:52 EDT 2013


On Fri, Sep 27, 2013 at 04:10:49PM +0100, Dave Martin wrote:
> On Fri, Sep 27, 2013 at 03:29:13PM +0100, Lorenzo Pieralisi wrote:
> > On the TC2 testchip, when all CPUs in a cluster enter standbywfi and
> > commit a power down request, the power controller will wait for
> 
> [...]
> 
> (Minor nit: please wrap the commit message to a shorter length for git
> log.  I use 67, but I can't remember where that recommendation came from)

Ok, will do.

> [...]
> 
> > This patch moves the GIC CPU IF disable call in the TC2 MCPM implementation
> > from the suspend method to the power down method to fix the mentioned
> > race condition.
> 
> Reviewed-by: Dave Martin <Dave.Martin at arm.com>
> Tested-by: Dave Martn <Dave.Martin at arm.com> (for kexec)

Thanks !

> It's worth briefly summarising the kexec scenario too:
> 
> kexec hotplugs secondary CPUs out during kernel reload/restart.
> Because kexec may (deliberately) trash the old kernel text, it is
> not OK for CPUs to follow the MCPM soft reboot path, since
> instructions after the WFI may have been replaced by kexec.
> 
> If tc2_pm_down() does not disable the GIC cpu interface, there is a
> race between CPU powerdown in the old kernel and the IPI from the
> new kernel that triggers secondary boot, particluarly if the
> powerdown is slow (due to L2 cache cleaning for example).  If the
> new kernel wins the race, the affected CPU(s) will not really be
> reset and may execute garbage after the WFI.
> 
> (some comments below)
> 
> > 
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi at arm.com>
> > Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha at arm.com>
> > ---
> >  arch/arm/mach-vexpress/tc2_pm.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/mach-vexpress/tc2_pm.c b/arch/arm/mach-vexpress/tc2_pm.c
> > index 7aeb5d6..e6eb481 100644
> > --- a/arch/arm/mach-vexpress/tc2_pm.c
> > +++ b/arch/arm/mach-vexpress/tc2_pm.c
> > @@ -131,6 +131,16 @@ static void tc2_pm_down(u64 residency)
> >  	} else
> >  		BUG();
> >  
> > +	/*
> > +	 * If the CPU is committed to power down, make sure
> > +	 * the power controller will be in charge of waking it
> > +	 * up upon IRQ, ie IRQ lines are cut from GIC CPU IF
> > +	 * to the CPU by disabling the GIC CPU IF to prevent wfi
> > +	 * from completing execution behind power controller back
> > +	 */
> > +	if (!skip_wfi)
> > +		gic_cpu_if_down();
> > +
> 
> In my version of this fix, I disabled the CPU interface much earlier,
> just after entering the function.  I think it probably works either
> way.  Do you think the location is critical, or should it not matter?
> 
> As far as I could tell, it matters only that this is done sometime
> between committing to WFI and actually doing it, but if you think
> there are extra requirements then I would like to understand them.

You are right, it can be added anywhere before executing wfi; I just added it
there for two reasons:

- MCPM checked if wfi should be skipped. If wfi should be skipped disabling
  GIC CPU IF is useless
- I want to issue the write to the GIC before we enter the no-fly zone
  (disable C-bit, clean cache, exit SMP (and disable CCI)). I know we
  could disable the GIC CPU IF right before executing wfi, but let's not
  play with fire, if that write hides a barrier we are in a bind.

Lorenzo




More information about the linux-arm-kernel mailing list