shared memory problem on ARM v5TE using threads

Mon Dec 7 07:16:28 EST 2009

It's definitely this issue.
If I disable L2, everything works.
Also, Russell's debug prints show that adjust_pte is never called for the write process's page.

I think option 2 below is preferred: we don't want to flush the entire L2 on every context switch, as that would be a performance killer.
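
To make option 2 concrete, here is a minimal sketch of the idea only, not a kernel patch. It assumes the C and B bits sit at bits 3 and 2 of the page descriptor; every demo_* / DEMO_* name is hypothetical and made up for illustration. A mapping that is both shared and writable gets C=0 B=1, and everything else keeps its cache attributes:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapping flags for this sketch; not the kernel's VM_* values. */
#define DEMO_VM_WRITE   0x1u
#define DEMO_VM_SHARED  0x2u

/* Assumed positions of the B (bufferable) and C (cacheable) bits. */
#define DEMO_PTE_B      (1u << 2)
#define DEMO_PTE_C      (1u << 3)

/*
 * Return the descriptor bits to use for a mapping with the given flags.
 * Shared writable mappings are forced to C=0 B=1 (non-cacheable,
 * bufferable); all other mappings are returned unchanged.
 */
static uint32_t demo_adjust_prot(uint32_t vm_flags, uint32_t pte)
{
    const uint32_t shared_writable = DEMO_VM_WRITE | DEMO_VM_SHARED;

    if ((vm_flags & shared_writable) == shared_writable) {
        pte &= ~DEMO_PTE_C;   /* C=0: keep the data out of L1 and L2 */
        pte |= DEMO_PTE_B;    /* B=1: writes may still be buffered */
    }
    return pte;
}

/* Small self-check of the sketch above. */
static void demo_self_check(void)
{
    const uint32_t cached = DEMO_PTE_C | DEMO_PTE_B;

    /* Shared + writable: cacheable bit is cleared. */
    assert(demo_adjust_prot(DEMO_VM_WRITE | DEMO_VM_SHARED, cached) == DEMO_PTE_B);
    /* Private writable and shared read-only mappings stay cacheable. */
    assert(demo_adjust_prot(DEMO_VM_WRITE, cached) == cached);
    assert(demo_adjust_prot(DEMO_VM_SHARED, cached) == cached);
}
```

The cost is that every shared writable mapping loses caching, whether or not a second mapping of the page ever shows up, but it avoids touching the context switch path.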

Regards

-----Original Message-----
From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk] 
Sent: Monday, December 07, 2009 1:42 PM
To: saeed bishara
Cc: Ronen Shitrit; hs at denx.de; linux-arm-kernel at lists.infradead.org
Subject: Re: shared memory problem on ARM v5TE using threads

On Mon, Dec 07, 2009 at 01:31:41PM +0200, saeed bishara wrote:
> > I ran it on an ARM926EJ-S, which is ARMv5 and worked fine.
> >
> Does it have an L2 cache?

No.

> > If there's no problem with C=0 B=1 mappings on Kirkwood, I've no idea
> > what's going on, and I don't have any suggestion on what to try next.
> >
> > The log shows that the kernel is doing the right thing: when we detect
> > two mappings for the same page in the same MM space, we clean and
> > invalidate any existing cacheable mappings visible in the MM space
> > (both L1 and L2), and switch all visible mappings to C=0 B=1 mappings.
> > This makes the area non-cacheable.
>
> What about the PTE in the MM space of the write process? If it remains
> C=1 B=1, then its data will sit in the L2, and since the L2 is not
> flushed on context switch, that would explain this behavior.

That's probably the issue, and it means that _all_ shared writable
mappings on your processor will be broken.

Oh dear, that really is bad news.

There are two solutions to this which I can currently think of:
1. flush the L2 cache on every context switch
2. make all shared writable mappings non-cacheable

Neither of those two options appeals.  Since it's only one set of CPUs
which are affected, we really don't want to apply any fix for this to the
generic ARM kernel code - especially when all other L2 caches are sensibly
implemented as PIPT rather than VIVT.

Can we please forget that Feroceon CPUs exist? ;)