USB mass storage and ARM cache coherency
Catalin Marinas
catalin.marinas at arm.com
Tue Mar 2 12:05:27 EST 2010
On Tue, 2010-03-02 at 21:11 +0900, FUJITA Tomonori wrote:
> On Sun, 28 Feb 2010 10:31:03 +0530
> James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
> > But the point of all of this is that I cache invalidation doesn't appear
> > anywhere in the I/O path ... so if we're getting I/D incoherency,
> > there's some problem in the mm code (or there's a missing arch
> > assumption ... like I cache gets moved in more aggressively than we
> > expect). Parisc is very sensitive to I/D incoherency, so we'd notice if
> > there were a serious generic problem here.
>
> I'm not sure that there are some problems in the mm or common code. Is
> this ARM's implementation issue? (Of course, the usb stack and the
> driver's misuse of the DMA API needs to be fixed too).
Just to summarise - on ARM (PIPT / non-aliasing VIPT) there is I-cache
invalidation for user pages in update_mmu_cache() (it could actually be
in set_pte_at on SMP to avoid a race but that's for another thread). The
D-cache is flushed by this function only if the PG_arch_1 bit is set.
This bit is set in the ARM case by flush_dcache_page(), following the
advice in Documentation/cachetlb.txt.
With some drivers (those doing PIO) or subsystems (SCSI mass storage
over USB HCD), there is no call to flush_dcache_page() for page cache
pages, so PG_arch_1 is never set and the ARM implementation of
update_mmu_cache() doesn't flush the D-cache. Invalidating the I-cache
alone doesn't help, since the data written by the CPU may still be
sitting in the D-cache and not yet visible to instruction fetches.
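
To make the failure mode concrete, here is a toy userspace model of the
behaviour described above. It is only a sketch, not the kernel code: the
toy_* names and the struct are invented for illustration, and clearing the
bit after the flush is an assumption of the model.

```c
/* Toy userspace model of the current ARM scheme; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

#define PG_ARCH_1 (1UL << 0)    /* models PG_arch_1: "D-cache needs flushing" */

struct toy_page {
    unsigned long flags;
    bool dcache_has_unwritten_data; /* CPU wrote data that is not yet in RAM */
    bool icache_may_be_stale;       /* I-cache may hold old contents */
};

/* Models flush_dcache_page(): only records that a flush is needed later. */
static void toy_flush_dcache_page(struct toy_page *pg)
{
    pg->flags |= PG_ARCH_1;
}

/* Models a PIO driver filling a page cache page with the CPU. */
static void toy_pio_fill(struct toy_page *pg, bool driver_calls_flush)
{
    pg->dcache_has_unwritten_data = true;
    pg->icache_may_be_stale = true;
    if (driver_calls_flush)
        toy_flush_dcache_page(pg);
}

/* Models update_mmu_cache() when the page gets mapped into user space. */
static void toy_update_mmu_cache(struct toy_page *pg)
{
    if (pg->flags & PG_ARCH_1) {
        pg->dcache_has_unwritten_data = false;  /* D-cache written back */
        pg->flags &= ~PG_ARCH_1;                /* assumption: bit cleared */
    }
    pg->icache_may_be_stale = false;            /* I-cache always invalidated */
}

/* True if an instruction fetch from the page would see the new data. */
static bool toy_exec_is_coherent(const struct toy_page *pg)
{
    return !pg->dcache_has_unwritten_data && !pg->icache_may_be_stale;
}
```

In this model a page filled with toy_pio_fill(pg, false) is still
incoherent after toy_update_mmu_cache(), while the same sequence with
driver_calls_flush set to true ends up coherent.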
The viable solutions so far:

     1. Implement a PIO mapping API similar to the DMA API which takes
        care of the D-cache flushing. This means that PIO drivers would
        need to be modified to use an API like pio_kmap()/pio_kunmap()
        before writing to a page cache page.
     2. Invert the meaning of PG_arch_1 to denote a clean page. This
        means that by default newly allocated page cache pages are
        considered dirty and even if there isn't a call to
        flush_dcache_page(), update_mmu_cache() would flush the D-cache.
        This is the PowerPC approach.
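
For option 1, the driver-side pattern might look roughly like the sketch
below. No such API exists; pio_kmap()/pio_kunmap() are only names floated
in this thread, so the signatures, the toy_ prefix, the struct and the
flush counter are all hypothetical, and the whole thing is a userspace
model, not kernel code.

```c
/* Toy userspace sketch of a PIO mapping API; hypothetical, NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

struct pio_page {
    unsigned char data[TOY_PAGE_SIZE];
    bool dcache_has_unwritten_data; /* CPU wrote data not yet in RAM */
    unsigned int flushes;           /* counts modelled D-cache flushes */
};

/* Hypothetical pio_kmap(): hand out a CPU pointer, assume it gets written. */
static void *toy_pio_kmap(struct pio_page *pg)
{
    pg->dcache_has_unwritten_data = true;
    return pg->data;
}

/* Hypothetical pio_kunmap(): the API, not the driver, does the flush. */
static void toy_pio_kunmap(struct pio_page *pg)
{
    if (pg->dcache_has_unwritten_data) {
        pg->dcache_has_unwritten_data = false;
        pg->flushes++;
    }
}

/* A PIO read path: copy bytes from a device FIFO into the page. */
static void toy_pio_read_sector(struct pio_page *pg,
                                const unsigned char *fifo, size_t len)
{
    unsigned char *dst = toy_pio_kmap(pg);

    memcpy(dst, fifo, len < TOY_PAGE_SIZE ? len : TOY_PAGE_SIZE);
    toy_pio_kunmap(pg);
}
```

The point of the pairing is that a driver cannot write to the page
without going through the unmap step, much like the map/unmap calls of
the DMA API.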
Option 2 above looks pretty appealing to me since it can be done in the
ARM code exclusively. I've done some tests and it does solve the cache
coherency problem with a rootfs on a USB stick. As Russell suggested, it
can be optimised to mark a page as clean when the DMA API is used, to
avoid duplicate flushing.
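
Option 2 and the DMA optimisation can be modelled in the same toy
fashion. Again this is a userspace sketch under stated assumptions, not a
patch: the inv_* names, the struct and the flush counter are made up, and
the place where the DMA code would mark the page clean is a guess.

```c
/* Toy userspace model of the inverted PG_arch_1 scheme; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

#define PG_ARCH_1 (1UL << 0)    /* models PG_arch_1, here meaning "D-cache clean" */

struct inv_page {
    unsigned long flags;         /* zero on a new page, i.e. considered dirty */
    unsigned int dcache_flushes; /* counts modelled D-cache flushes */
};

/* Models flush_dcache_page(): mark the page dirty again (clear "clean"). */
static void inv_flush_dcache_page(struct inv_page *pg)
{
    pg->flags &= ~PG_ARCH_1;
}

/* Assumed hook: the DMA API has already done the cache maintenance. */
static void inv_dma_from_device_done(struct inv_page *pg)
{
    pg->flags |= PG_ARCH_1;
}

/* Models update_mmu_cache(): flush unless the page is known to be clean. */
static void inv_update_mmu_cache(struct inv_page *pg)
{
    if (!(pg->flags & PG_ARCH_1)) {
        pg->dcache_flushes++;
        pg->flags |= PG_ARCH_1;
    }
}
```

A freshly allocated page filled by PIO gets exactly one flush on its
first inv_update_mmu_cache() call even though nobody called
inv_flush_dcache_page(), while a page filled via DMA and marked clean
gets none.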
It was also suggested to add a PG_arch_2 flag which would keep track of
the I-cache status as well.
I can post a proposal to modify the cachetlb.txt document to reflect the
issues we currently have on ARM.
--
Catalin