[PATCH 1/2] ARM: mmu: optimize dma_alloc_coherent for cache-coherent DMA masters

Ahmad Fatoum a.fatoum at barebox.org
Thu Jan 15 04:05:52 PST 2026


If a device is DMA-capable and cache-coherent, it can be considerably
faster to keep shared memory cached instead of unconditionally mapping
it uncached, as we currently do.

This was very noticeable when using VirtIO with KVM acceleration, as
described in commit 3ebd05809a49 ("virtio: don't use DMA API unless
required").

In preparation for simplifying the code added by the aforementioned
commit, consult dev_is_dma_coherent() before doing any cache maintenance.
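
Callers need no adjustment: the dma_alloc_coherent()/dma_free_coherent()
signatures are unchanged; a coherent master simply gets a cached buffer
back. A minimal consumer sketch (the function and variable names here
are made up for illustration, error handling elided):

  #include <dma.h>
  #include <linux/sizes.h>

  static int example_setup_ring(struct device *dev)
  {
          dma_addr_t ring_dma;
          void *ring;

          /*
           * Same call as before this patch: for a cache-coherent
           * master the buffer now stays cached; for a non-coherent
           * one it is still flushed and remapped uncached.
           */
          ring = dma_alloc_coherent(dev, SZ_4K, &ring_dma);

          /* ... hand ring_dma to the device, access ring from the CPU ... */

          dma_free_coherent(dev, ring, ring_dma, SZ_4K);

          return 0;
  }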

Signed-off-by: Ahmad Fatoum <a.fatoum at barebox.org>
---
 arch/arm/cpu/mmu-common.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/cpu/mmu-common.c b/arch/arm/cpu/mmu-common.c
index a1431c0ff461..2b22ab47cac8 100644
--- a/arch/arm/cpu/mmu-common.c
+++ b/arch/arm/cpu/mmu-common.c
@@ -50,9 +50,11 @@ void *dma_alloc_map(struct device *dev,
 		*dma_handle = (dma_addr_t)ret;
 
 	memset(ret, 0, size);
-	dma_flush_range(ret, size);
 
-	remap_range(ret, size, map_type);
+	if (!dev_is_dma_coherent(dev)) {
+		dma_flush_range(ret, size);
+		remap_range(ret, size, map_type);
+	}
 
 	return ret;
 }
@@ -70,8 +72,8 @@ void *dma_alloc_coherent(struct device *dev,
 void dma_free_coherent(struct device *dev,
 		       void *mem, dma_addr_t dma_handle, size_t size)
 {
-	size = PAGE_ALIGN(size);
-	remap_range(mem, size, MAP_CACHED);
+	if (!dev_is_dma_coherent(dev))
+		remap_range(mem, PAGE_ALIGN(size), MAP_CACHED);
 
 	free(mem);
 }
-- 
2.47.3