[RFC PATCH v2 2/3] arm64: add IOMMU dma_ops

Robin Murphy robin.murphy at arm.com
Mon Mar 9 13:09:49 PDT 2015

Hi Russell,

On 09/03/15 17:59, Russell King - ARM Linux wrote:
>>>> For a noncoherent device, dma_map_single() will end up calling
>>> __dma_map_area() with the page offset and size of the original request, so
>>> the updated part gets flushed by VA, and the rest of the page isn't touched
>>> if it doesn't need to be. On the other hand if the page tables were
>>> allocated with dma_alloc_coherent() in the first place, then just calling
>>> dma_sync_single_for_device() for the updated region should suffice.
> That's wrong.  dma_sync_single_*() is not permitted to be called on
> coherently allocated memory.  Where coherent memory needs to be remapped,
> dma_sync_single_*() will panic the kernel.
> If it's in coherent memory, all you should need is the appropriate
> memory barrier to ensure that the DMA agent can see the writes.

You're quite right, that's the whole point of *coherent* allocations 
after all. I got my syncs and barriers muddled there.

>>> Where exactly would you call the dma_unmap? It seems a bit strange to
>>> be repeatedly calling dma_map and never calling dma_unmap. I don't see it
>>> explicitly forbidden in the docs anywhere to do this but it seems like
>>> it would be violating the implicit handoff of dma_map/dma_unmap.
>> I think ideally you'd call dma_map_page when you first create the page
>> table, dma_sync_single_for_device on any update, and dma_unmap_page when you
>> tear it down, and you'd also use the appropriate DMA addresses everywhere
>> instead of physical addresses.
> No.
> dma_map_page()			ownership changes CPU->DMA
> dma_sync_single_for_cpu()	ownership changes DMA->CPU
> dma_sync_single_for_device()	ownership changes CPU->DMA
> dma_unmap_page()		ownership changes DMA->CPU
> It's invalid to miss out the pairing that give those ownership changes.

Thanks for the clarification - the wording in DMA-API.txt rather implies 
that in the DMA_TO_DEVICE case you only have to sync the updated data 
/after/ writing it. For the sake of purely getting pages flushed, would 
it be more reasonable then to call dma_map_single() followed immediately 
by dma_unmap_single_attrs() with DMA_ATTR_SKIP_CPU_SYNC? Since we know 
the IOMMU can never write back to memory  (ones that can are a different 
issue) it would be nice to be able to skip the extra invalidations 
somehow, without too heinously abusing the API.


More information about the linux-arm-kernel mailing list