[PATCH] arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)
Will Deacon
will.deacon at arm.com
Thu Feb 22 08:58:39 PST 2018
On Thu, Feb 22, 2018 at 04:06:38PM +0000, Catalin Marinas wrote:
> Commit 97303480753e ("arm64: Increase the max granular size") increased
> the cache line size to 128 to match Cavium ThunderX, apparently for some
> performance benefit which could not be confirmed. This change, however,
> has an impact on network packet allocation in certain
> circumstances, where a packet requires slightly over a 4K page,
> causing a significant performance degradation.
>
> This patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line) while
> keeping ARCH_DMA_MINALIGN at 128. The cache_line_size() function was
> changed to default to ARCH_DMA_MINALIGN in the absence of a meaningful
> CTR_EL0.CWG bit field.
>
> In addition, if a system with ARCH_DMA_MINALIGN < CTR_EL0.CWG is
> detected, the kernel will force swiotlb bounce buffering for all
> non-coherent devices since DMA cache maintenance on sub-CWG ranges is
> not safe, leading to data corruption.
>
> Cc: Tirumalesh Chalamarla <tchalamarla at cavium.com>
> Cc: Timur Tabi <timur at codeaurora.org>
> Cc: Florian Fainelli <f.fainelli at gmail.com>
> Cc: Will Deacon <will.deacon at arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas at arm.com>
> ---
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/cache.h | 6 +++---
> arch/arm64/include/asm/dma-direct.h | 43 +++++++++++++++++++++++++++++++++++++
> arch/arm64/kernel/cpufeature.c | 9 ++------
> arch/arm64/mm/dma-mapping.c | 15 +++++++++++++
> arch/arm64/mm/init.c | 3 ++-
> 6 files changed, 66 insertions(+), 11 deletions(-)
> create mode 100644 arch/arm64/include/asm/dma-direct.h
[...]
> +static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
> +{
> + if (!dev->dma_mask)
> + return false;
> +
> + /*
> + * Force swiotlb buffer bouncing when ARCH_DMA_MINALIGN < CWG. The
> + * swiotlb bounce buffers are aligned to (1 << IO_TLB_SHIFT).
> + */
> + if (static_branch_unlikely(&swiotlb_noncoherent_bounce) &&
> + !is_device_dma_coherent(dev) &&
> + !is_swiotlb_buffer(dma_to_phys(dev, addr)))
> + return false;
> +
> + return addr + size - 1 <= *dev->dma_mask;
> +}
I can't think of a better way to do this and hopefully this won't actually
trigger in practice, so:
Acked-by: Will Deacon <will.deacon at arm.com>
Will