[PATCH] ARM: l2c: fix support for HW coherent systems in PL310 cache

Catalin Marinas catalin.marinas at arm.com
Wed Aug 12 09:57:11 PDT 2015


On Wed, Aug 12, 2015 at 10:14:25AM +0200, Gregory CLEMENT wrote:
> On 11/08/2015 19:26, Catalin Marinas wrote:
> > On Tue, Aug 11, 2015 at 07:05:43PM +0200, Gregory CLEMENT wrote:
> >> From: Nadav Haklai <nadavh at marvell.com>
> >>
> >> When a PL310 cache is used in a system that provides hardware
> >> coherency, all outer cache operations are useless and can be
> >> skipped.  Moreover, on some systems they are harmful, as they cause
> >> deadlocks between the Marvell coherency mechanism, the Marvell PCIe
> >> controller and the Cortex-A9.
> >>
> >> This commit extends a previous commit:
> >> 98ea2dba65932ffc456b6d7b11b8a0624e2f7b95 which added the io-coherent
> >> support for the PL310 cache by also disabling the outer cache flush
> >> range operation.
> >>
> >> In the current kernel implementation, the outer cache flush range
> >> operation is triggered by the dma_alloc function.
> >> This operation can take place during runtime and in some
> >> circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x
> >> SoCs.
> > 
> > While this may work around the issue for your specific SoC, I think a
> > better fix is in DMA alloc code to avoid flushing caches for coherent
> > devices. This would be the __dma_clear_buffer() implementation which
> > isn't aware of whether the device is coherent or not.
> 
> Indeed, the other use of the outer cache flush range is done pretty
> early in the boot and should not be a problem.
> 
> What do you think of the following patch?
[...]
>  arch/arm/include/asm/outercache.h | 9 +++++++++
>  arch/arm/mm/cache-l2x0.c          | 1 +
>  arch/arm/mm/dma-mapping.c         | 6 ++++--
>  3 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
> index 563b92fc2f41..7f7bbfcf1d32 100644
> --- a/arch/arm/include/asm/outercache.h
> +++ b/arch/arm/include/asm/outercache.h
> @@ -35,6 +35,7 @@ struct outer_cache_fns {
>  	void (*sync)(void);
>  #endif
>  	void (*resume)(void);
> +	bool is_coherent;
> 
>  	/* This is an ARM L2C thing */
>  	void (*write_sec)(unsigned long, unsigned);
> @@ -56,6 +57,14 @@ static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
>  }
> 
>  /**
> + * outer_is_coherent - tell if the outer cache is io coherent
> + */
> +static inline bool outer_is_coherent(void)
> +{
> +	return outer_cache.is_coherent;
> +}
> +
> +/**
>   * outer_clean_range - clean dirty outer cache lines
>   * @start: starting physical address, inclusive
>   * @end: end physical address, exclusive
> diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
> index 71b3d3309024..0071e276adc0 100644
> --- a/arch/arm/mm/cache-l2x0.c
> +++ b/arch/arm/mm/cache-l2x0.c
> @@ -1291,6 +1291,7 @@ static const struct l2c_init_data of_l2c310_coherent_data __initconst = {
>  		.flush_all   = l2c210_flush_all,
>  		.disable     = l2c310_disable,
>  		.resume      = l2c310_resume,
> +		.is_coherent = true,
>  	},
>  };

I don't really think we need these. If you need to check a device, we
already have is_device_dma_coherent(). Just pass a bool argument to
__dma_clear_buffer() and fix the call sites. The __dma_alloc()
function also takes a "bool coherent" but in this case
__alloc_simple_buffer() is only ever used on coherent devices, so you
may not even need to check is_device_dma_coherent() (however, there are
some patches to add CMA support for coherent devices, I don't know
whether they are queued yet).
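
To illustrate the shape of the change suggested above, here is a minimal
user-space sketch, not the actual kernel code. It assumes that "coherent"
means both the inner and the outer flush can be skipped. The names
clear_buffer(), mock_dmac_flush_range() and mock_outer_flush_range() are
hypothetical stand-ins for __dma_clear_buffer(), dmac_flush_range() and
outer_flush_range(); the mocks only count calls and touch no caches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical call counters standing in for real cache maintenance. */
static unsigned int inner_flushes;
static unsigned int outer_flushes;

/* Stand-in for dmac_flush_range(): only records that it was called. */
static void mock_dmac_flush_range(const void *start, const void *end)
{
	(void)start;
	(void)end;
	inner_flushes++;
}

/* Stand-in for outer_flush_range(): only records that it was called. */
static void mock_outer_flush_range(const void *start, const void *end)
{
	(void)start;
	(void)end;
	outer_flushes++;
}

/*
 * Sketch of a clear-buffer helper that takes the coherency of the
 * device as an argument (assumption: the caller derives it from
 * something like is_device_dma_coherent()).  The buffer is always
 * zeroed; cache maintenance is done only for non-coherent devices.
 */
static void clear_buffer(void *buf, size_t size, bool coherent)
{
	assert(buf != NULL);

	memset(buf, 0, size);
	if (coherent)
		return;

	mock_dmac_flush_range(buf, (char *)buf + size);
	mock_outer_flush_range(buf, (char *)buf + size);
}
```

With this shape the decision is made per device by the caller, so the
outer cache code does not need to carry an is_coherent flag of its own.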

> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 1ced8a0f7a52..4d87df2c16f9 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -241,12 +241,14 @@ static void __dma_clear_buffer(struct page *page, size_t size)
>  			page++;
>  			size -= PAGE_SIZE;
>  		}
> -		outer_flush_range(base, end);
> +		if (!outer_is_coherent())
> +			outer_flush_range(base, end);
>  	} else {
>  		void *ptr = page_address(page);
>  		memset(ptr, 0, size);
>  		dmac_flush_range(ptr, ptr + size);
> -		outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +		if (!outer_is_coherent())
> +			outer_flush_range(__pa(ptr), __pa(ptr) + size);
>  	}
>  }

This would make sense if we had a device which is outer coherent but
inner non-coherent, though I doubt this is the case you are trying to
fix.

-- 
Catalin


