Kernel related (?) user space crash at ARM11 MPCore

Russell King - ARM Linux linux at arm.linux.org.uk
Sun Oct 25 09:39:18 EDT 2009


On Tue, Oct 20, 2009 at 12:39:08PM +0100, Catalin Marinas wrote:
> On Thu, 2009-10-15 at 16:56 +0100, Catalin Marinas wrote:
> > On Thu, 2009-10-15 at 16:28 +0100, Russell King - ARM Linux wrote:
> > > On Thu, Oct 15, 2009 at 04:20:22PM +0100, Catalin Marinas wrote:
> > > > On Thu, 2009-10-15 at 15:57 +0100, Russell King - ARM Linux wrote:
> > > > > On Mon, Sep 21, 2009 at 11:07:51AM +0100, Russell King - ARM Linux wrote:
> > > > > > On Mon, Sep 21, 2009 at 10:44:23AM +0100, Catalin Marinas wrote:
> > > > > > > We would need to fix this somehow as well. We currently handle the
> > > > > > > I-cache in update_mmu_cache() when a page is first mapped if it has
> > > > > > > VM_EXEC set.
> > > > > > 
> > > > > > The reason I'm pushing you hard to separate the two issues is that the
> > > > > > two should be treated separately.  I think we need to consider ensuring
> > > > > > that freed pages do not have any I-cache lines associated with them,
> > > > > > rather than waiting for them to be allocated and then dealing with the
> > > > > > I-cache problem.
> > > > > 
> > > > > Having now benchmarked this (making flush_cache_* always invalidate
> > > > > the I-cache, so free'd pages are I-cache clean), and to me, the results
> > > > > quite look promising - please try out this patch.
> [...]
> > > > Before trying the patch, I don't entirely agree with the approach. You
> > > > will get speculative fetches in the I-cache via the kernel linear
> > > > mapping (where NX is always cleared) on newer processors and may end up
> > > > with random faults in user space (not that likely but not impossible
> > > > either).
> > > 
> > > That means we have no option but to flush the I-cache every time a page
> > > is placed into userspace - we might as well make update_mmu_cache
> > > unconditionally flush the I-cache every time its called.
> [...]
> > We can flush the D-cache in copy_user_page(), maybe lazily via
> > flush_dcache_page() and invalidate the I-cache in update_mmu_cache() if
> > PG_arch_1 (ignoring VM_EXEC).
> 
> Something like below (based on your original suggestion for flushing the
> D-cache in copy_user_highpage).
> 
> BTW, the cache flushing code in Linux can be optimised a bit more on
> VIPT caches:
> 
>       * __cpuc_flush_dcache_page() could cope with just D-cache clean
>         rather than clean+invalidate

No it can not - that breaks shared mappings.  The problem is that
flush_dcache_page() is used in two circumstances.  These are described
in more detail in cachetlb.txt, but briefly:

1. After the kernel writes to its mapping for a page cache page, and needs
   to ensure that those writes are visible to shared mmap()s in userspace.

2. Before the kernel reads from its mapping for a page cache page, and
   needs to ensure that it reads up to date data written by userspace
   into those mappings.

So just cleaning the D-cache means that (2) will return stale data.

>       * whole I-cache invalidation was needed for some ARM1136 erratum.
>         We can conditionally revert it to invalidating a range

That's not what the commit (826cbda) says which implemented it.  Also,
since we have broken I-cache flushes even with the erratum enabled,
let's fix the work-around and re-evaluate the situation before changing
anything.

It could be that some of the I-cache problems are caused by the improperly
fixed erratum.

> Flush the D-cache during copy_user_highpage()
> 
> From: Catalin Marinas <catalin.marinas at arm.com>
> 
> The I and D caches for copy-on-write pages on processors with
> write-allocate caches become incoherent causing problems on application
> relying on CoW for text pages (dynamic linker relocating symbols in a
> text page). This patch flushes the D-cache for such pages (possibly
> lazily via update_mmu_cache which also takes care of the I-cache).

Actually, I think this is caused by a missing I-cache flush in
flush_cache_range().  Adding that flush should resolve this problem
in hand (and make VIPT aliasing and VIPT non-aliasing behave in the
same manner.)  That's something which my patch previously posted in
this thread did.

Note also that with ASID tagged VIVT I-cache, we are missing out
on cache flushing.  As you've identified, it's entirely possible
for text page translations to be changed, and according to B3.4.1
bullet 2, a flush is required.



More information about the linux-arm-kernel mailing list