[RFC] ARM DMA mapping TODO, v1

Russell King - ARM Linux linux at arm.linux.org.uk
Wed Apr 27 16:16:05 EDT 2011


On Wed, Apr 27, 2011 at 01:02:43PM +0200, Arnd Bergmann wrote:
> On Wednesday 27 April 2011, Russell King - ARM Linux wrote:
> > On Wed, Apr 27, 2011 at 10:56:49AM +0200, Arnd Bergmann wrote:
> > > We probably still need to handle both the coherent and noncoherent case
> > > in each dma_map_ops implementation, at least for those combinations where
> > > they matter (definitely the linear mapping). However, I think that using
> > > dma_mapping_common.h would let us use an architecture-independent dma_map_ops
> > > for the generic iommu code that Marek wants to introduce now.
> > 
> > The 'do we have an iommu or not' question and the 'do we need to do cache
> > coherency' question are two independent questions which are unrelated to
> > each other.  There are four unique but equally valid combinations.
> > 
> > Pushing the cache coherency question down into the iommu stuff will mean
> > that we'll constantly be fighting against the 'but this iommu works on x86'
> > shite that we've fought with over block device crap for years.  I have
> > no desire to go there.
> 
> Ok, I see. I believe we could avoid having to fight with the people that
> only care about coherent architectures if we just have two separate
> implementations of dma_map_ops in the iommu code, one for coherent
> and one for noncoherent DMA. Any architecture that only needs one
> of them would then only enable the Kconfig options for that implementation
> and not care about the other one.

But then we have to invent yet another whole new API to deal with the
cache coherency issues - which makes for more documentation, and eventually
more abuse because it won't quite do what architectures want it to do,
etc.

> Yes, that definitely sounds possible. I guess it could be as simple
> as having a flag somewhere in struct device if we want to make it
> architecture independent.

I was referring to a flag in the dma_ops to say whether the DMA ops
implementation requires DMA cache coherency.  In the case of swiotlb,
performing full DMA cache coherency is a pure waste of CPU cycles -
and probably makes DMA much more expensive than merely switching back
to using PIO.
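
To illustrate the kind of flag meant here, a minimal userspace sketch
follows.  It is not kernel code: struct sketch_dma_ops, the
needs_cache_maint field and every sketch_* function are hypothetical
names invented for this example, and the "cache maintenance" is only a
counter.  The point is that the mapping path consults a property of the
ops implementation, so a bounce-buffer style implementation can opt out
of cache maintenance it does not need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DMA ops table; NOT the kernel's dma_map_ops. */
struct sketch_dma_ops {
	const char *name;
	bool needs_cache_maint;	/* assumed flag: does this implementation
				 * need CPU cache maintenance at all? */
};

/* Counts simulated cache maintenance operations. */
static unsigned long sketch_cache_ops;

/* Simulated cache clean; a real one would operate on the CPU caches. */
static void sketch_cache_clean(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	sketch_cache_ops++;
}

/* Map a buffer for DMA, returning a fake bus address. */
static uintptr_t sketch_map_single(const struct sketch_dma_ops *ops,
				   void *buf, size_t len)
{
	assert(ops != NULL);

	/* Only pay for cache maintenance when the ops ask for it. */
	if (ops->needs_cache_maint)
		sketch_cache_clean(buf, len);

	return (uintptr_t)buf;
}

/* Linear mapping on a non-coherent system: maintenance needed. */
static const struct sketch_dma_ops sketch_linear_ops = {
	.name = "linear",
	.needs_cache_maint = true,
};

/* Bounce-buffer style implementation: flag cleared in this sketch. */
static const struct sketch_dma_ops sketch_bounce_ops = {
	.name = "bounce",
	.needs_cache_maint = false,
};
```

Mapping a buffer through sketch_linear_ops bumps the counter once;
mapping the same buffer through sketch_bounce_ops leaves it unchanged.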

I'm really not interested in producing "generic" interfaces which end up
throwing the baby out with the bath water when we already have a better
implementation in place - even if the hardware sucks.  That's not
forward progress as far as I'm concerned.

> As for making cache handling the default, I'm not completely
> sure how that would work on architectures where most devices are coherent.
> If I understood the DRM people correctly, some x86 machines have noncoherent
> DMA in their GPUs while everything else is coherent.

Well, it sounds like struct device needs a flag to indicate whether it is
coherent or not - but exactly how this gets set seems to be architecture
dependent.  I don't see bus or driver code being able to make the necessary
decisions - eg, tulip driver on x86 would be coherent, but tulip driver on
ARM would be non-coherent.

Nevertheless, doing it on a per-device basis is definitely the right
answer.
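
As a rough sketch of the per-device idea, again in userspace C rather
than kernel code: struct sketch_device, its dma_coherent field and the
sketch_*_setup_dma() hooks are all hypothetical names, and the
"example-gpu" driver name is made up to stand in for the non-coherent
GPU case mentioned above.  What it tries to show is that the same driver
ends up with a different answer depending on which architecture code set
the flag, without the driver or bus code making the decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct device; only what the sketch needs. */
struct sketch_device {
	const char *driver;	/* name of the bound driver */
	bool dma_coherent;	/* assumed per-device coherency flag */
};

/*
 * Hypothetical x86 hook: devices are treated as coherent, except for
 * one made-up non-coherent GPU driver used purely for illustration.
 */
static void sketch_x86_setup_dma(struct sketch_device *dev)
{
	assert(dev != NULL && dev->driver != NULL);
	dev->dma_coherent = strcmp(dev->driver, "example-gpu") != 0;
}

/* Hypothetical ARM hook: this sketch treats every device as non-coherent. */
static void sketch_arm_setup_dma(struct sketch_device *dev)
{
	assert(dev != NULL);
	dev->dma_coherent = false;
}

/* The DMA mapping code would consult the device, not the driver. */
static bool sketch_needs_cache_maint(const struct sketch_device *dev)
{
	return !dev->dma_coherent;
}
```

With this, a "tulip" device set up by the x86 hook needs no cache
maintenance, while the same device set up by the ARM hook does.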
