[PATCH RFC 10/12] dma: fix dma_sync when not all device DMA is equally coherent

The LS1046A features a cache-coherent interconnect and the drivers
configure the hardware appropriately, e.g. by setting the FMan
PRAM_MODE_GLOBAL bit, so that the Ethernet controllers snoop the caches.

Yet, we use the standard arm64 cache maintenance routines when the MMU
is enabled and thus risk memory corruption if the CPU prefetches receive
buffers in the time window between dma_map_single() cleaning them to the
Point of Coherency and dma_unmap_single() invalidating them[1].

To properly solve this issue, we need to consult the newly added
per-device DMA coherency attribute to decide whether to do manual cache
maintenance.
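
As a rough illustration of the decision described above, the following
standalone sketch models the check outside of barebox. It assumes that
dev_is_dma_coherent() is tri-state (positive: coherent, zero:
non-coherent, negative: not specified), which is only inferred from the
"<= 0" comparison in the diff. All names in the sketch (fake_device,
coherency, fake_dev_is_dma_coherent, needs_cache_maintenance) are
hypothetical stand-ins, not barebox definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device attribute; the values are
 * assumptions chosen to match a tri-state return convention. */
enum coherency {
	COHERENCY_UNKNOWN = -1,	/* nothing specified for this device */
	COHERENCY_NONE = 0,	/* device DMA does not snoop caches */
	COHERENCY_FULL = 1,	/* device DMA snoops caches */
};

/* Hypothetical minimal device, not barebox's struct device. */
struct fake_device {
	const char *name;
	enum coherency dma_coherent;
};

/* Assumed semantics: > 0 coherent, 0 non-coherent, < 0 unspecified. */
static int fake_dev_is_dma_coherent(const struct fake_device *dev)
{
	return dev ? (int)dev->dma_coherent : (int)COHERENCY_UNKNOWN;
}

/* Same test as in the diff: skip manual cache maintenance only for
 * devices positively known to be coherent. */
static bool needs_cache_maintenance(const struct fake_device *dev)
{
	return fake_dev_is_dma_coherent(dev) <= 0;
}
```

Under these assumptions, a device without any coherency information
keeps the existing behavior of cleaning and invalidating buffers, so
only explicitly coherent devices change behavior.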

[1]: https://lore.kernel.org/all/a5d6cc26-cd23-7c31-f56e-f6d535ea39b0@arm.com/

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
---
 drivers/dma/map.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/map.c b/drivers/dma/map.c
index 114c0f7db3bd..be0ee258cc59 100644
--- a/drivers/dma/map.c
+++ b/drivers/dma/map.c
@@ -25,7 +25,8 @@ dma_addr_t dma_map_single(struct device *dev, void *ptr, size_t size,
 	unsigned long addr = (unsigned long)ptr;
-	dma_sync_single_for_device(addr, size, dir);
+	if (dev_is_dma_coherent(dev) <= 0)
+		dma_sync_single_for_device(addr, size, dir);
 	return cpu_to_dma(dev, ptr);
@@ -35,5 +36,6 @@ void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
 	unsigned long addr = (unsigned long)dma_to_cpu(dev, dma_addr);
-	dma_sync_single_for_cpu(addr, size, dir);
+	if (dev_is_dma_coherent(dev) <= 0)
+		dma_sync_single_for_cpu(addr, size, dir);

