USB mass storage and ARM cache coherency

Catalin Marinas catalin.marinas at arm.com
Tue Mar 2 12:47:52 EST 2010


On Tue, 2010-03-02 at 17:05 +0000, Catalin Marinas wrote:
> On Tue, 2010-03-02 at 21:11 +0900, FUJITA Tomonori wrote:
> > On Sun, 28 Feb 2010 10:31:03 +0530
> > James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
> > > But the point of all of this is that I cache invalidation doesn't appear
> > > anywhere in the I/O path ... so  if we're getting I/D incoherency,
> > > there's some problem in the mm code (or there's a missing arch
> > > assumption ... like I cache gets moved in more aggressively than we
> > > expect).  Parisc is very sensitive to I/D incoherency, so we'd notice if
> > > there were a serious generic problem here.
> >
> > I'm not sure that there are some problems in the mm or common code. Is
> > this ARM's implementation issue? (Of course, the usb stack and the
> > driver's misuse of the DMA API needs to be fixed too).
> 
> Just to summarise - on ARM (PIPT / non-aliasing VIPT) there is I-cache
> invalidation for user pages in update_mmu_cache() (it could actually be
> in set_pte_at on SMP to avoid a race but that's for another thread). The
> D-cache is flushed by this function only if the PG_arch_1 bit is set.
> This bit is set in the ARM case by flush_dcache_page(), following the
> advice in Documentation/cachetlb.txt.
> 
> With some drivers (those doing PIO) or subsystems (SCSI mass storage
> over USB HCD), there is no call to flush_dcache_page() for page cache
> pages, hence the ARM implementation of update_mmu_cache() doesn't flush
> the D-cache (and only invalidating the I-cache doesn't help).
> 
> The viable solutions so far:
> 
>      1. Implement a PIO mapping API similar to the DMA API which takes
>         care of the D-cache flushing. This means that PIO drivers would
>         need to be modified to use an API like pio_kmap()/pio_kunmap()
>         before writing to a page cache page.
>      2. Invert the meaning of PG_arch_1 to denote a clean page. This
>         means that by default newly allocated page cache pages are
>         considered dirty and even if there isn't a call to
>         flush_dcache_page(), update_mmu_cache() would flush the D-cache.
>         This is the PowerPC approach.
> 
> Option 2 above looks pretty appealing to me since it can be done in the
> ARM code exclusively. I've done some tests and it indeed solves the
> cache coherency with a rootfs on a USB stick. As Russell suggested, it
> can be optimised to mark a page as clean when the DMA API is involved to
> avoid duplicate flushing.

Actually, option 2 still has an issue: it does not easily work on SMP
systems where cache maintenance operations aren't broadcast in hardware.
In this case (ARM11MPCore), flush_dcache_page() is implemented
non-lazily so that the flushing happens on the same processor that
dirtied the cache. But since some drivers never call this function, the
non-lazy flush never runs for their pages, and a later flush in
update_mmu_cache() may execute on a different CPU from the one holding
the dirty lines.
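
To make the flag semantics of option 2 concrete, here is a minimal
user-space sketch. It is a toy model, not kernel code: struct toy_page
and both helper names are invented for illustration, and the only thing
it models is the decision "does update_mmu_cache() flush the D-cache?".

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel symbols. */
struct toy_page {
    bool pg_arch_1;   /* stands in for the PG_arch_1 page flag */
};

/*
 * Scheme described for ARM above: the flag is set by
 * flush_dcache_page() and means "D-cache dirty, flush when mapped".
 */
static bool needs_flush_flag_means_dirty(const struct toy_page *p)
{
    return p->pg_arch_1;
}

/*
 * Option 2: the flag means "D-cache clean". A newly allocated page has
 * the flag clear, so it is treated as dirty until something marks it
 * clean.
 */
static bool needs_flush_flag_means_clean(const struct toy_page *p)
{
    return !p->pg_arch_1;
}

/* What a flush would do to the flag under option 2 (assumed). */
static void mark_clean(struct toy_page *p)
{
    p->pg_arch_1 = true;
}
```

For a page filled by a PIO driver that never called
flush_dcache_page(), the flag is still clear. The first helper then
returns false (no flush, stale data possible) and the second returns
true (flush happens). Neither helper says anything about which CPU
performs the flush, which is the SMP problem described here.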

One possible solution is to do something like read-for-ownership before
flushing the D-cache in update_mmu_cache() (or set_pte_at()), so that
the dirty lines are held by the CPU that performs the flush.
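
A rough sketch of what "something like read-for-ownership" could look
like, written as plain C so it can be compiled in user space. The
function name, the 32-byte line size and the 4096-byte page size are
assumptions for illustration. The idea is that a load followed by a
store of the same value to each cache line should, on hardware with
coherent D-caches, pull the line into the local CPU's cache in a
modified state, so that a subsequent local flush writes it out.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE 32u    /* assumed D-cache line size */
#define TOY_PAGE_SIZE  4096u  /* assumed page size */

/*
 * Hypothetical helper: touch one byte in every cache line of a page,
 * reading it and writing the same value back. The data is unchanged;
 * only cache line ownership is expected to move to the calling CPU.
 * volatile keeps the compiler from eliding the load/store pair.
 */
static void toy_read_for_ownership(volatile uint8_t *page)
{
    for (size_t off = 0; off < TOY_PAGE_SIZE; off += TOY_CACHE_LINE) {
        uint8_t v = page[off];  /* load: fetch the line */
        page[off] = v;          /* store: take ownership of the line */
    }
}
```

Note that the load/store pair is not atomic, so this sketch is only
safe if nothing else writes to the page concurrently. A real
implementation would have to be followed by the actual D-cache flush on
the same CPU, with preemption and migration prevented in between.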

-- 
Catalin



