[PATCH v2 1/2] ARM: dma-mapping: Don't use outer_flush_range when the L2C is coherent
Catalin Marinas
catalin.marinas at arm.com
Fri Aug 14 10:43:41 PDT 2015
On Fri, Aug 14, 2015 at 06:12:55PM +0200, Gregory CLEMENT wrote:
> When a L2 cache controller is used in a system that provides hardware
> coherency, the entire outer cache operations are useless, and can be
> skipped. Moreover, on some systems, it is harmful as it causes
> deadlocks between the Marvell coherency mechanism, the Marvell PCIe
> controller and the Cortex-A9.
>
> In the current kernel implementation, the outer cache flush range
> operation is triggered by the dma_alloc function.
> This operation can be take place during runtime and in some
> circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x
> SoCs.
>
> This patch extends the __dma_clear_buffer() function to receive a
> boolean argument related to the coherency of the system. The same
> things is done for the calling functions.
>
> Reported-by: Nadav Haklai <nadavh at marvell.com>
> Signed-off-by: Gregory CLEMENT <gregory.clement at free-electrons.com>
> Cc: <stable at vger.kernel.org> # v3.16+
Most likely the patch won't apply on kernels before 4.3. But that's
fine, you can send separate patches to stable that work on those kernels
and reference the mainline commit.
> @@ -358,9 +364,14 @@ static int __init atomic_pool_init(void)
> if (!atomic_pool)
> goto out;
>
> + /*
> + * Here we don't know if the system is coherent, but it is
> + * harmless to use the non-coherent case in the
> + * __alloc_from_contiguous() call.
> + */
> if (dev_get_cma_area(NULL))
> ptr = __alloc_from_contiguous(NULL, atomic_pool_size, prot,
> - &page, atomic_pool_init, true);
> + &page, atomic_pool_init, true, false);
This comment is not entirely correct. The atomic pool is only used for
non-coherent allocations (the prot is pgprot_dmacoherent which means
Normal Non-cacheable memory). So we must pass false for is_coherent.
> @@ -474,7 +485,8 @@ static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,
> {
> struct page *page;
> void *ptr = NULL;
> - page = __dma_alloc_buffer(dev, size, gfp);
> + /* This function is only called when the arch is non-coherent */
> + page = __dma_alloc_buffer(dev, size, gfp, false);
Nitpick: "__alloc_remap_buffer() is only called when the arch is
non-coherent". That's to avoid misreading as __dma_alloc_buffer() being
called only for non-coherent cases.
BTW, for other comments as well, instead of "arch" being coherent or not
I would say "device". We check this property on a per-device basis.
> @@ -601,7 +614,8 @@ static void *__alloc_simple_buffer(struct device *dev, size_t size, gfp_t gfp,
> struct page **ret_page)
> {
> struct page *page;
> - page = __dma_alloc_buffer(dev, size, gfp);
> + /* This function is only called when the arch is coherent */
> + page = __dma_alloc_buffer(dev, size, gfp, true);
Same nitpick as above, "__alloc_simple_buffer() is only called...".
Few minor things above on comments, otherwise:
Reviewed-by: Catalin Marinas <catalin.marinas at arm.com>
More information about the linux-arm-kernel
mailing list