ARM11MPcore: tlb_ops_need_broadcast causes deadlock

Will Deacon will.deacon at arm.com
Mon Mar 26 11:20:37 EDT 2012


Hmm, I somehow got dropped from this, but I'll pick it back up here.

On Sun, Mar 25, 2012 at 10:55:35PM +0100, Russell King - ARM Linux wrote:
> On Sun, Mar 25, 2012 at 10:22:49PM +0200, Peter Waechtler wrote:
> > And again I don't understand the abort handler: why do we get a page
> > fault on a young page then? grrh
> 
> Permissions?  Userspace trying to write to the page when it isn't marked
> writable and dirty?
> 
> >> Moreover, what about the case where we actually remove the page?
> > I don't claim that this is the only way to deadlock - but this is the
> > case we encounter.
> 
> No, but you're arguing that we drop the TLB flush for your specific case.
> I'm telling you that's pointless if there's other cases as well which
> we'll deadlock.
> 
> But that's neither here nor there because you haven't fully explained
> what the problem is yet.

Yes, I'm inclined to agree with Russell here. Things don't add up and
without further information there's not a lot we can do.

Peter - are you able to reproduce and investigate this problem or was it a
one-off observation? If you can figure out what really goes on inside CPU B
in your example, then we may be able to look into this further. A good first
step might be to work out what triggers the initial data abort and look at
the state of the world at that point.

Will


