USB mass storage and ARM cache coherency
James Bottomley
James.Bottomley at HansenPartnership.com
Thu Mar 4 09:21:52 EST 2010
On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > On Wed, 2010-03-03 at 21:54 +0000, Pavel Machek wrote:
> > > > With some drivers (those doing PIO) or subsystems (SCSI mass storage
> > > > over USB HCD), there is no call to flush_dcache_page() for page cache
> > > > pages, hence the ARM implementation of update_mmu_cache() doesn't flush
> > > > the D-cache (and only invalidating the I-cache doesn't help).
> > > >
> > > > The viable solutions so far:
> > > >
> > > > 1. Implement a PIO mapping API similar to the DMA API which takes
> > > > care of the D-cache flushing. This means that PIO drivers would
> > > > need to be modified to use an API like pio_kmap()/pio_kunmap()
> > > > before writing to a page cache page.
> > > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This
> > > > means that by default newly allocated page cache pages are
> > > > considered dirty and even if there isn't a call to
> > > > flush_dcache_page(), update_mmu_cache() would flush the D-cache.
> > > > This is the PowerPC approach.
> > >
> > > What about option
> > >
> > > 3. Forget about PG_arch_1 and always do the flush?
> > >
> > > How big is the performance impact? Note that current code does not
> > > even *work* so working, 10% slower code will be an improvement.
> >
> > The driver fix is as simple as calling a flush_dcache_page() and I've
> > been carrying such patches in my tree for some time now. The question is
> > whether we need to do it in the driver or not (would need to update
> > Documentation/cachetlb.txt as well).
> >
> > The reason I'm not in favour of always doing the flush is that we penalise
> > DMA drivers where there is no need for extra D-cache flushing (already
> > handled by the DMA API; option 1 above is similar, just that it is meant
> > for PIO usage). An ARM patch I proposed for inverting the meaning of
> > PG_arch_1 also marks a page as clean in the dma_map_* functions.
>
> But you are not fixing a driver bug, are you?
Technically, he is. In the old days, most VI (virtually indexed cache)
architectures were high end enough not to require PIO transfers. The only
exception was an IDE driver used by sparc, which led to the arch-specific
IDE in/out string instructions, in which sparc actually did all the
necessary flushing. So no drivers other than old IDE grew up with cache
flushing in the PIO case (and almost no high-end VI hardware had an IDE
interface, so the flushing variants rarely got implemented in the arch
layer). However, recently, with the transition from old IDE to libata and
the prevalence of ARM with more commodity hardware, the deficiency is
becoming exposed. Even the PA8000 workstations now come with an IDE CD,
which means we're starting to have problems with them as well.
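
To make the PIO failure mode concrete, here is a rough user-space toy model, not kernel code. It assumes the current ARM meaning of PG_arch_1 ("D-cache flush pending"). Every toy_* name and the two-buffer "cache" are invented for illustration; only flush_dcache_page(), update_mmu_cache() and PG_arch_1 are the real names being imitated.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64   /* tiny page, enough for the illustration */

/* Hypothetical toy page: 'ram' is what a user mapping would read,
 * 'dcache' holds data written through the kernel mapping that has
 * not been written back yet. */
struct toy_page {
    unsigned char ram[TOY_PAGE_SIZE];
    unsigned char dcache[TOY_PAGE_SIZE];
    bool lines_dirty;   /* hardware state: kernel-alias lines are newer than RAM */
    bool pg_arch_1;     /* software flag: a D-cache flush is pending */
};

/* Write dirty lines back to RAM (stands in for the real cache flush). */
static void toy_writeback(struct toy_page *p)
{
    if (p->lines_dirty) {
        memcpy(p->ram, p->dcache, TOY_PAGE_SIZE);
        p->lines_dirty = false;
    }
}

/* Imitates the deferred flush_dcache_page(): only record that a flush is due. */
void toy_flush_dcache_page(struct toy_page *p)
{
    p->pg_arch_1 = true;
}

/* Imitates update_mmu_cache(): flush only if somebody asked for it. */
void toy_update_mmu_cache(struct toy_page *p)
{
    if (p->pg_arch_1) {
        toy_writeback(p);
        p->pg_arch_1 = false;
    }
}

/* A PIO driver copies data with the CPU, so it lands in the D-cache.
 * 'driver_flushes' selects whether the driver calls the flush hook. */
void toy_pio_read(struct toy_page *p, const unsigned char *src, size_t len,
                  bool driver_flushes)
{
    assert(len <= TOY_PAGE_SIZE);
    memcpy(p->dcache, src, len);
    p->lines_dirty = true;
    if (driver_flushes)
        toy_flush_dcache_page(p);
}

/* Returns true if a user mapping would read the expected bytes. */
bool toy_user_sees(const struct toy_page *p, const unsigned char *expected,
                   size_t len)
{
    return memcmp(p->ram, expected, len) == 0;
}
```

Filling a zeroed toy_page with toy_pio_read(..., false) and then calling toy_update_mmu_cache() leaves toy_user_sees() false, because nothing ever set the flag. The same sequence with driver_flushes set to true makes it true, which is the one-line driver fix discussed above.
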
> Seems like ARM has a requirement other architectures do not, that is
> a) not documented anywhere
> b) causes problems
>
> You could argue that performance improvement (how big is it, anyway?)
> is worth it, but this should be agreed to by wider community...
Performance is always worth it provided we don't sacrifice correctness.
The thing which was discovered in this thread is basically that ARM is
handling deferred flushing (for D/I coherency) in a slightly different
way from everyone else ... once that's fixed, ARM will likely not have
the D/I problem, but we'll still have the libata (and other PIO systems)
D flushing issue.
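
As a rough illustration of that last point, here is a user-space toy model (again not kernel code) of the inverted PG_arch_1 scheme from option 2, with the dma_map_* path marking the page clean as in the proposed ARM patch. All inv_* names, the flush counter and the two-buffer "cache" are made up for the sketch, and it assumes an aliasing cache where the user mapping does not see the kernel-alias lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64   /* tiny page, enough for the illustration */

/* Hypothetical toy page with the inverted flag: pg_clean set means
 * "known clean"; a freshly allocated (zeroed) page counts as dirty. */
struct inv_page {
    unsigned char ram[TOY_PAGE_SIZE];     /* what a user mapping would read */
    unsigned char dcache[TOY_PAGE_SIZE];  /* data not yet written back */
    bool lines_dirty;                     /* kernel-alias lines newer than RAM */
    bool pg_clean;                        /* inverted PG_arch_1 */
    unsigned flushes;                     /* D-cache flushes performed */
};

/* PIO fill with no flush_dcache_page() call at all (the unfixed driver). */
void inv_pio_fill(struct inv_page *p, const unsigned char *src, size_t len)
{
    assert(len <= TOY_PAGE_SIZE);
    memcpy(p->dcache, src, len);
    p->lines_dirty = true;
}

/* DMA fill: the device writes RAM directly and the (imitated) dma_map_*
 * path marks the page clean, so no later flush is needed. */
void inv_dma_fill(struct inv_page *p, const unsigned char *src, size_t len)
{
    assert(len <= TOY_PAGE_SIZE);
    memcpy(p->ram, src, len);
    p->lines_dirty = false;
    p->pg_clean = true;
}

/* Imitates update_mmu_cache(): flush unless the page is known clean. */
void inv_update_mmu_cache(struct inv_page *p)
{
    if (!p->pg_clean) {
        if (p->lines_dirty) {
            memcpy(p->ram, p->dcache, TOY_PAGE_SIZE);
            p->lines_dirty = false;
        }
        p->flushes++;
        p->pg_clean = true;
    }
}

/* Returns true if a user mapping would read the expected bytes. */
bool inv_user_sees(const struct inv_page *p, const unsigned char *expected,
                   size_t len)
{
    return memcmp(p->ram, expected, len) == 0;
}
```

In this model a new page filled by PIO is flushed on its first inv_update_mmu_cache(), and a DMA-filled page is not flushed at all. A second inv_pio_fill() on a page already marked clean is still not written back, which is the remaining PIO D flushing issue.
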
James