[PATCH] ARM: mm: dma: Update coherent streaming apis with missing memory barrier

Catalin Marinas catalin.marinas at arm.com
Thu Apr 24 02:09:27 PDT 2014


On Wed, Apr 23, 2014 at 06:17:27PM +0100, Will Deacon wrote:
> On Wed, Apr 23, 2014 at 05:02:16PM +0100, Catalin Marinas wrote:
> > On Wed, Apr 23, 2014 at 10:02:51AM +0100, Will Deacon wrote:
> > > On Tue, Apr 22, 2014 at 09:30:27PM +0100, Santosh Shilimkar wrote:
> > > > writel() or an explcit barrier in the driver will do the job. I was
> > > > just thinking that we are trying to work around the short comings
> > > > of streaming API by adding barriers in the driver. For example
> > > > on a non-coherent system, i don't need that barrier because
> > > > dma_ops does take care of that.
> > > 
> > > I wonder whether we can remove those barriers altogether then (from the DMA
> > > cache operations). For the coherent case, the driver must provide the
> > > barrier (probably via writel) so the non-coherent case shouldn't be any
> > > different.
> > 
> > For the DMA_TO_DEVICE case the effect should be the same as wmb()
> > implies dsb (and outer_sync() for write). But the reason we have
> > barriers in the DMA ops is slightly different - the completion of the
> > cache maintenance operation rather than ordering with any previous
> > writes to the DMA buffer.
> > 
> > In the DMA_FROM_DEVICE scenario for example, the CPU gets an interrupt
> > for a finished DMA transfer and executes dma_unmap_single() prior to
> > accessing the page. However the CPU access after unmapping is done using
> > normal LDR/STR which do not imply any barrier. So we need to ensure the
> > completion of the cache invalidation in the dma operation.
> 
> I don't think we necessarily need completion, we just need ordering. That
> is, the normal LDR/STR instructions must be observed after the cache
> maintenance. I'll have to revisit the ARM ARM to be sure of this, but a dmb
> should be sufficient for that guarantee.

If we only do D-cache maintenance by MVA, the ARM ARM (both v7 and v8)
claims that these are ordered relative to any explicit load/stores to
the same address. So in theory we don't even need a DMB for unmapping
with DMA_FROM_DEVICE. But in practice, we may have the outer cache,
hence a DSB is required before the outer_sync() (we could move it there
though).

-- 
Catalin



More information about the linux-arm-kernel mailing list