Shutdown problem in SMP system happened on Tegra20

Stephen Warren swarren at wwwdotorg.org
Fri Aug 24 11:19:09 EDT 2012


On 08/24/2012 02:23 AM, Bill Huang wrote:
> Hi,
> 

[Explicitly CCing a few people, and dropping the linux-kernel mailing
list since the ARM list is most useful]

> When shutting down on Tegra20/Tegra30, we need to read/write PMIC registers over I2C
> to perform the power-off sequence. Unfortunately, shutdown sometimes fails on Tegra20
> because of an I2C timeout. The timeout happens because the CPU that the I2C controller
> IRQ is affined to can be taken offline without the IRQs affined to it being migrated,
> so subsequent I2C transactions fail (no CPU handles that interrupt from then on).
> 
> A snippet of the shutdown code:
> 
> void kernel_power_off(void)
> {
> 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
> 	:
> 	disable_nonboot_cpus();
> 	:
> 	machine_power_off();
> }
> 
> void machine_power_off(void)
> {
> 	machine_shutdown();
> 	if (pm_power_off)
> 		pm_power_off(); /* this is where we send I2C write to shutdown */
> }
> 
> void machine_shutdown(void)
> {
> #ifdef CONFIG_SMP
> 	smp_send_stop();
> #endif
> }
> 
> "smp_send_stop()" sends "IPI_CPU_STOP" to take every CPU offline except the
> current one (smp_processor_id()). However, the current CPU is not always cpu0, at
> least on Tegra20. For example, cpu1 might be the current CPU, in which case cpu0 is
> taken offline, and that is why the I2C transaction times out.
> 
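To make sure I'm reading the failure mode correctly, here is a toy userspace
model of it. This is only a sketch, not kernel code: the struct, the toy_*
names, and the assumption that the I2C IRQ is affined only to cpu0 are all
mine, and the real smp_send_stop() does considerably more than clear bits in
a mask.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2   /* Tegra20 is dual-core */

/* Toy model with one bit per CPU. All names here are hypothetical. */
struct toy_system {
	unsigned int online_mask;   /* CPUs that are still online */
	unsigned int i2c_irq_mask;  /* CPUs the I2C IRQ may be delivered to */
};

/*
 * Models the effect of smp_send_stop(): every CPU except the caller
 * goes offline, and no IRQ affinity is touched.
 */
static void toy_send_stop(struct toy_system *sys, unsigned int this_cpu)
{
	assert(this_cpu < NR_CPUS);
	sys->online_mask = 1u << this_cpu;
}

/* An IRQ can be serviced only if it targets at least one online CPU. */
static bool toy_irq_deliverable(const struct toy_system *sys)
{
	return (sys->i2c_irq_mask & sys->online_mask) != 0;
}

/* Returns true if the I2C IRQ still has a CPU to run on after the stop. */
static bool toy_shutdown_from(unsigned int this_cpu)
{
	struct toy_system sys = {
		.online_mask  = (1u << NR_CPUS) - 1,
		.i2c_irq_mask = 1u << 0,  /* assumption: I2C IRQ affined to cpu0 */
	};

	toy_send_stop(&sys, this_cpu);
	return toy_irq_deliverable(&sys);
}
```

In this model toy_shutdown_from(0) leaves the IRQ deliverable, while
toy_shutdown_from(1) leaves it with no online target, which would match the
timeout you see when cpu1 happens to be the CPU running the shutdown.
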
> In the normal case, the "disable_nonboot_cpus()" call disables all CPUs except
> cpu0, so we would not hit the problem described here: cpu0 would always be the
> current CPU in "smp_send_stop()". But the call to "disable_nonboot_cpus()" only
> happens when "CONFIG_PM_SLEEP_SMP" is enabled, which is not the case for
> Tegra20/Tegra30. We don't support suspend yet, so it can't be enabled.
> 
> There are two known fixes for this. The first is to enable suspend (ARCH_SUSPEND_POSSIBLE)
> so that cpu0 is the only online CPU during "machine_shutdown()". The second
> is to add a call to "migrate_irqs()" in "ipi_cpu_stop()" so that all IRQs are migrated to
> the CPU that stays active.
> 
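The second option can be sketched in the same toy userspace style. Again this
is only a model under my own assumptions: model_migrate_irqs() is a
hypothetical stand-in for the idea of migrating IRQs off a CPU that is going
down, not the real migrate_irqs(), which may well behave differently; the
IRQ count and the "everything starts on cpu0" affinity are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2
#define NR_IRQS 4   /* arbitrary, for illustration only */

/* Toy model; all names are hypothetical. One bit per CPU in each mask. */
struct model {
	unsigned int online_mask;
	unsigned int irq_affinity[NR_IRQS];
};

/*
 * Stand-in for the migration step: any IRQ whose affinity mask no longer
 * contains an online CPU is retargeted at the CPUs that remain online.
 */
static void model_migrate_irqs(struct model *m)
{
	for (int irq = 0; irq < NR_IRQS; irq++) {
		if ((m->irq_affinity[irq] & m->online_mask) == 0)
			m->irq_affinity[irq] = m->online_mask;
	}
}

/* Models the stop handler with the proposed migration call added. */
static void model_cpu_stop(struct model *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	m->online_mask &= ~(1u << cpu);  /* mark this CPU offline */
	model_migrate_irqs(m);           /* then move its IRQs elsewhere */
}

static bool model_all_irqs_deliverable(const struct model *m)
{
	for (int irq = 0; irq < NR_IRQS; irq++) {
		if ((m->irq_affinity[irq] & m->online_mask) == 0)
			return false;
	}
	return true;
}

/* Stop every CPU except this_cpu and report whether IRQs still work. */
static bool model_shutdown_from(unsigned int this_cpu)
{
	struct model m = { .online_mask = (1u << NR_CPUS) - 1 };

	for (int irq = 0; irq < NR_IRQS; irq++)
		m.irq_affinity[irq] = 1u << 0;  /* assumption: all IRQs on cpu0 */

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu != this_cpu)
			model_cpu_stop(&m, cpu);
	}
	return model_all_irqs_deliverable(&m);
}
```

With the migration step in place, the model keeps every IRQ deliverable
whichever CPU runs the shutdown. Whether it is safe to do the equivalent
from the IPI handler in the real kernel is a separate question that the
model does not answer.
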
> Could someone familiar with the ARM SMP design help to answer my two questions?
> 
> 1. Is it expected that "smp_processor_id()" can return a CPU other than cpu0 in
>    the call to "smp_send_stop()"? On Tegra30 it is always cpu0, but on Tegra20 it
>    is about 50-50, and I just can't find the magic.
> 
> 2. If the current CPU is not necessarily cpu0 in the call to "smp_send_stop()",
>    does it make sense to add "migrate_irqs()" to "ipi_cpu_stop()"? Or is there
>    another fix that makes more sense?
> 
> Thanks,
> Bill



