omap4-panda-es boot issues with v3.15-rc4

Grygorii Strashko grygorii.strashko at ti.com
Thu May 8 10:12:21 PDT 2014


Hi,

On 05/08/2014 06:40 PM, Kevin Hilman wrote:
> On Thu, May 8, 2014 at 8:31 AM, Kevin Hilman <khilman at linaro.org> wrote:
>> Roger Quadros <rogerq at ti.com> writes:
>>
>>> Hi,
>>>
>>> Nishant pointed me to a booting issue with omap4-panda-es on linux-next but I'm observing
>>> similar issues, although less frequent, with v3.15-rc4 as well.
>>>
>>> Configuration:
>>>
>>> - kernel v3.15-rc4 or linux-next (20140507)
>>> - multi_v7_defconfig with LEDS_TRIGGER_HEARTBEAT and LEDS_GPIO enabled
>>> - u-boot/master       173d294b94cf
>>>
>>> Observations:
>>>
>>> - Out of 10 boots a few may not succeed and hang midway without any warnings. Heartbeat LED stops.
>>> e.g. http://www.hastebin.com/ebumojegoq.vhdl
>>>
>>> - Hang more noticeable on linux-next (20140507) than on v3.15-rc4
>>
>> I've been noticing the same thing for a while with my boot tests.  For
>> me, next-20140508 is failing most of the time now.
>>
>>> - Hang more noticeable with USB_EHCI_HCD enabled but hang observed even without USB_EHCI_HCD.
>>> Maybe related to when high speed interrupts occur in the boot process.
>>>
>>> - On successful boots following warning is seen
>>> [    4.010375] gic_timer_retrigger: lost localtimer interrupt
>>>
>>> - On successful boots heartbeat LED stops blinking after boot process and left idle. LED can remain stuck in
>>> ON state as well. It does blink again when doing activity on console.
>>>
>>> Workaround:
>>>
>>> - Disabling CPU_IDLE or even just disabling C3 (MPU OSWR) seems to fix all the above issues.
>>>
>>> I don't really know what exactly is the issue but it seems to be specific to OMAP4, GIC, MPU OSWR.
>>
>> I can confirm that disabling CONFIG_CPU_IDLE seems to make the problem
>> go away.  Hmm....
> 
> Another finger pointing in the same direction: omap2plus_defconfig +
> CONFIG_CPU_IDLE=y also fails to boot rather consistently in today's
> -next.

Is it observed on OMAP4460 only?
If not, it's something new.
If yes, maybe some race condition is still present.

Roger, is it possible to connect a debugger and check the GIC distributor status
(gic_dist_base_addr + GIC_DIST_CTRL) in case of failure?
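
For reference, a minimal sketch of what to look at from the debugger. It only
shows how the register address is formed and how bit 0 (the distributor enable
bit) of the value read back could be decoded. The GIC_DIST_CTRL offset is the
one from the kernel's arm-gic.h; the physical base address and the helper
names are assumptions for illustration and should be checked against the TRM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the distributor control register, as defined in the
 * kernel's include/linux/irqchip/arm-gic.h. */
#define GIC_DIST_CTRL 0x000u

/* ASSUMPTION: physical base of the OMAP4 GIC distributor. Verify this
 * against the TRM before using it in a debugger session. */
#define ASSUMED_OMAP4_GIC_DIST_BASE 0x48241000u

/* Hypothetical helper: address to read in the debugger. */
static inline uint32_t gic_dist_ctrl_addr(uint32_t dist_base)
{
	return dist_base + GIC_DIST_CTRL;
}

/* Hypothetical helper: bit 0 of the value read from GIC_DIST_CTRL is
 * the enable bit; if it is clear, the distributor forwards no
 * interrupts to the CPU interfaces. */
static inline bool gic_dist_is_enabled(uint32_t ctrl_value)
{
	return (ctrl_value & 0x1u) != 0;
}
```

If the value read at that address has bit 0 clear while the board is hung,
the distributor was left disabled, which would fit the suspected race.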

According to the current code (OMAP4460), CPU0 can get stuck only if CPU1 is
somehow kicked out of the PWRDM_POWER_OFF state by something other than CPU0.
The code assumes that CPU1 can exit the PWRDM_POWER_OFF state only when CPU0 calls clkdm_wakeup(cpu_clkdm[1]);

Sorry, but I'm not able to debug it now.

Regards,
-grygorii


