[Linaro-mm-sig] [RFC] ARM DMA mapping TODO, v1

Russell King - ARM Linux linux at arm.linux.org.uk
Thu Apr 28 06:51:31 EDT 2011


On Thu, Apr 28, 2011 at 12:32:32PM +0200, Marek Szyprowski wrote:
> On Thursday, April 28, 2011 11:38 AM Russell King - ARM Linux wrote:
> > > > > 2. Implement dma_alloc_noncoherent on ARM. Marek pointed out
> > > > >    that this is needed, and it currently is not implemented, with
> > > > >    an outdated comment explaining why it used to not be possible
> > > > >    to do it.
> > > >
> > > > dma_alloc_noncoherent is an entirely pointless API afaics.
> > >
> > > I was about to ask what the point is ... (what is the expected
> > > semantic ? Memory that is reachable but not necessarily cache
> > > coherent ?)
> > 
> > As far as I can see, dma_alloc_noncoherent() should just be a wrapper
> > around the normal page allocation function.  I don't see it ever needing
> > to do anything special - and the advantage of just being the normal
> > page allocation function is that its properties are well known and
> > architecture independent.
> 
> If there is an IOMMU chip that supports pages larger than 4KiB, then
> dma_alloc_noncoherent() might try to allocate such larger pages, which
> will result in faster access to the buffer (lower IOMMU TLB miss ratio).
> For large buffers, even 64KiB 'pages' give a significant performance
> improvement.

The memory allocated by dma_alloc_noncoherent() (and dma_alloc_coherent())
has to be virtually contiguous, and DMA contiguous.  It is assumed by all
drivers that:

	virt = dma_alloc_foo(size, &dma);

	cpuaddr = virt + offset;
	dmaaddr = dma + offset;

results in the CPU (via cpuaddr) and the device (via dmaaddr) ultimately
accessing the same memory location, for 0 <= offset < size.
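
To make that assumption concrete, here is a minimal userspace sketch, not
kernel code.  It models the CPU view and the device view as two contiguous
address ranges, each backed page by page by physical frames, and checks that
base + offset resolves to the same physical byte in both.  Every name in it
(model_mapping, model_resolve, model_views_agree) and the 4KiB page size are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4KiB pages */
#define MODEL_PAGE_SIZE  ((uint64_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical model of one contiguous address range (CPU or device view). */
struct model_mapping {
	uint64_t base;		/* first address of the range */
	const uint64_t *pfn;	/* physical frame backing each page */
	size_t npages;
};

/* Resolve an address inside the range to the physical address behind it. */
static uint64_t model_resolve(const struct model_mapping *m, uint64_t addr)
{
	uint64_t off = addr - m->base;

	assert(off < (uint64_t)m->npages * MODEL_PAGE_SIZE);
	return (m->pfn[off >> MODEL_PAGE_SHIFT] << MODEL_PAGE_SHIFT) |
	       (off & (MODEL_PAGE_SIZE - 1));
}

/* Return 1 if both views reach the same byte for every 0 <= offset < size. */
static int model_views_agree(const struct model_mapping *cpu,
			     const struct model_mapping *dev, uint64_t size)
{
	uint64_t offset;

	for (offset = 0; offset < size; offset++)
		if (model_resolve(cpu, cpu->base + offset) !=
		    model_resolve(dev, dev->base + offset))
			return 0;
	return 1;
}
```

Note that the backing frames need not be physically adjacent in this model.
What matters is that each view is contiguous in its own address space and
that both views map page i of the buffer to the same frame.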

The standard alloc_pages() also ensures that if you ask for an order-N
allocation, you'll end up with 2^N physically contiguous pages - so
there's no difference there.
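
As a rough illustration of what "order-N" means for a given buffer size, the
following userspace sketch rounds a byte count up to the smallest
power-of-two number of pages.  It is modelled on what the kernel's
get_order() helper computes, but the function names and the 4KiB page size
here are assumptions for illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4KiB pages */

/* Smallest order such that 2^order pages cover 'size' bytes (size > 0). */
static unsigned int model_get_order(size_t size)
{
	size_t pages;
	unsigned int order = 0;

	assert(size > 0);
	pages = ((size - 1) >> MODEL_PAGE_SHIFT) + 1;	/* round up to pages */
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Number of contiguous bytes an order-N allocation provides. */
static size_t model_order_bytes(unsigned int order)
{
	return (size_t)1 << (order + MODEL_PAGE_SHIFT);
}
```

Under these assumptions a 64KiB buffer is an order-4 allocation, and a 12KiB
request is rounded up to order 2, i.e. 16KiB of contiguous memory.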

What I'd suggest is that dma_alloc_noncoherent() should be architecture
independent, and should call into whatever IOMMU support the device has
to set up an appropriate IOMMU mapping.  IOW, I don't see any need for
every architecture to provide its own dma_alloc_noncoherent() allocation
function - or indeed for every IOMMU implementation to deal with the
allocation issues either.
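
One possible shape for that split, as a userspace sketch only: a single
generic allocation function gets memory from the ordinary allocator and then
asks the device's IOMMU hook, if it has one, for a device address.  None of
these names are existing kernel interfaces.  model_iommu_ops, model_device,
model_dma_alloc_noncoherent and the example model_linear_map are all
hypothetical, and malloc() merely stands in for the normal page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-device IOMMU hook: mapping only, no allocation. */
struct model_iommu_ops {
	/* Returns 0 and fills *dma on success, non-zero on failure. */
	int (*map)(void *ctx, void *cpu, size_t size, uint64_t *dma);
	void *ctx;
};

struct model_device {
	const struct model_iommu_ops *iommu;	/* NULL when there is no IOMMU */
};

/* Generic allocator: the same code for every architecture in this model. */
static void *model_dma_alloc_noncoherent(struct model_device *dev,
					 size_t size, uint64_t *dma)
{
	void *cpu;

	assert(dev && dma);
	cpu = malloc(size);	/* stand-in for the normal page allocator */
	if (!cpu)
		return NULL;
	if (!dev->iommu) {
		/* Model assumption: without an IOMMU the device sees CPU addresses. */
		*dma = (uint64_t)(uintptr_t)cpu;
		return cpu;
	}
	if (dev->iommu->map(dev->iommu->ctx, cpu, size, dma) != 0) {
		free(cpu);
		return NULL;
	}
	return cpu;
}

/* Example IOMMU for the model: hands out device addresses from a window. */
struct model_linear_iommu {
	uint64_t next;	/* next free device address */
	uint64_t end;	/* one past the end of the window */
};

static int model_linear_map(void *ctx, void *cpu, size_t size, uint64_t *dma)
{
	struct model_linear_iommu *w = ctx;

	(void)cpu;	/* a real IOMMU would program page tables from this */
	if (size > w->end - w->next)
		return -1;
	*dma = w->next;
	w->next += size;
	return 0;
}
```

In this arrangement the IOMMU code only decides how to map memory it is
given (and could pick larger IOMMU page sizes there), while the allocation
policy lives in one place.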


