v3.13-rc6+ regression (ARM board)

John Stultz john.stultz at linaro.org
Thu Jan 2 16:34:19 EST 2014


On 01/02/2014 12:43 PM, Linus Torvalds wrote:
> On Thu, Jan 2, 2014 at 12:30 PM, John Stultz <john.stultz at linaro.org> wrote:
>> So something else may be at play. Even with Linus' patch I reproduced a
>> similar hang here.
>>
>> Still chasing it down, but it looks like a seqlock deadlock where we're
>> calling read while holding the lock.
> Hmm. Only with lockdep, right?

Yep.

> Does lockdep perhaps read the scheduler clock? Afaik, we have
> lockstat_clock(), which uses local_clock(), which in turn translates
> to sched_clock_cpu(smp_processor_id())..
>
> So if that code now tries to read the scheduler clock when
> update_sched_clock() is doing a update and has done a
> write_seqcount_begin()...

Sigh. Deadlock by deadlock detection code.

So yeah, it looks like this is the case, though I've not been able to
get a backtrace during the hang to fully validate it (I'm just using
qemu's info registers and looking at the pc and lr).
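
To make the suspected pattern concrete, here is a rough userspace sketch
of a seqcount reader being invoked from inside the write section on the
same CPU. It is not the kernel code: every toy_* name is made up for
illustration, the "instrumented" flag merely stands in for lock
instrumentation that reads the clock, and the reader's retry loop is
bounded only so the demo terminates (an unbounded reader would spin
forever here).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy single-threaded seqcount. Hypothetical names, not the kernel API. */
struct toy_seqcount {
	unsigned sequence;	/* odd while a write is in progress */
};

static struct toy_seqcount seq;
static uint64_t epoch_ns;	/* data protected by seq */

static void toy_write_begin(struct toy_seqcount *s) { s->sequence++; }
static void toy_write_end(struct toy_seqcount *s)   { s->sequence++; }

/*
 * Reader: retry while a writer is active or the count changed.
 * max_spins is an assumption of this sketch so that it can report
 * failure instead of hanging.
 */
static bool toy_read_clock(uint64_t *out, int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		unsigned start = seq.sequence;

		if (start & 1)
			continue;	/* writer active: retry */

		uint64_t v = epoch_ns;

		if (seq.sequence == start) {
			*out = v;
			return true;
		}
	}
	return false;	/* never saw a stable, even sequence */
}

/*
 * Writer: when "instrumented" is set, something inside the write
 * section reads the clock. The writer cannot finish until that read
 * returns, and the read cannot succeed until the writer finishes.
 * Returns false if the nested read gave up.
 */
static bool toy_update_clock(uint64_t now, bool instrumented)
{
	bool read_ok = true;
	uint64_t ts;

	toy_write_begin(&seq);
	if (instrumented)
		read_ok = toy_read_clock(&ts, 1000);
	epoch_ns = now;
	toy_write_end(&seq);

	return read_ok;
}
```

Calling toy_update_clock(now, false) completes normally, while
toy_update_clock(now, true) has its nested read exhaust every retry,
which is the userspace analogue of the hang.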


So I'm guessing we'll just have to disable the lockdep logic here, which
is a little sad, since I'm a little nervous about the generic
sched_clock's locking (i.e., it works ok for ARM, but it's not NMI safe),
and having some better debugging tools there would be helpful.


Anyway, I'll send out a patch to disable the lockdep usage here shortly.

thanks
-john










