[PATCH 00/10] mm, arm64: Reduce ARCH_KMALLOC_MINALIGN below the cache line size

Catalin Marinas catalin.marinas at arm.com
Tue Apr 5 06:57:48 PDT 2022


Hi,

On arm64 ARCH_DMA_MINALIGN (and therefore ARCH_KMALLOC_MINALIGN) is 128.
While the majority of arm64 SoCs have a 64-byte cache line size (or
rather CWG - cache writeback granule), we chose a less than optimal
value in order to support all SoCs in a single kernel image.
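
For reference, the CWG is advertised in CTR_EL0 bits [27:24] as log2 of
the granule size in 4-byte words, so the size in bytes is 4 << CWG. A
minimal userspace sketch of that decoding follows. The helper name
cwg_bytes() and its fallback parameter are hypothetical; they only model
the case where the field reads as zero (granule not provided).

```c
#include <stdint.h>

#define CTR_CWG_SHIFT 24	/* CTR_EL0.CWG lives in bits [27:24] */
#define CTR_CWG_MASK  0xfu

/*
 * Hypothetical helper: decode the cache writeback granule in bytes from a
 * raw CTR_EL0 value. CWG is log2 of the granule in 4-byte words; a value
 * of 0 means the field is not provided, so the caller's fallback is used.
 */
static unsigned int cwg_bytes(uint64_t ctr_el0, unsigned int fallback)
{
	unsigned int cwg = (ctr_el0 >> CTR_CWG_SHIFT) & CTR_CWG_MASK;

	return cwg ? 4u << cwg : fallback;
}
```

With this encoding a CWG field of 4 corresponds to 64 bytes, 5 to 128
bytes and 6 to 256 bytes.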

The aim of this series is to allow a smaller default ARCH_KMALLOC_MINALIGN,
with the kmalloc() caches configured at boot time so that they remain safe
when an SoC has a larger DMA alignment requirement.

The first patch decouples ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
so that the latter is only used in DMA-specific compile-time
annotations. ARCH_KMALLOC_MINALIGN becomes the minimum (static)
guaranteed kmalloc() alignment, but it is not necessarily safe for
non-coherent DMA. Patches 2-7 change some drivers/ and crypto code to use
ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN.
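
As an illustration of what a DMA-specific compile-time annotation looks
like, here is a minimal userspace sketch. struct demo_dev and its members
are hypothetical and not taken from any of the patched drivers. It uses
C11 _Alignas where kernel code would typically use
__aligned(ARCH_DMA_MINALIGN), and it only shows the layout relative to
the start of the structure.

```c
#include <stddef.h>

#define ARCH_DMA_MINALIGN 128	/* arm64 value quoted in this cover letter */

/* Hypothetical driver-private structure, for illustration only. */
struct demo_dev {
	int state;	/* CPU-only bookkeeping */

	/* Buffer handed to a non-coherent device: start it on its own granule. */
	_Alignas(ARCH_DMA_MINALIGN) unsigned char dma_buf[32];

	/* Push the next member past the granule so it cannot share it. */
	_Alignas(ARCH_DMA_MINALIGN) int after;
};
```

The point of the annotation is that CPU writes to state or after never
land in the same 128-byte granule as dma_buf.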

Patch 8 introduces the dynamic arch_kmalloc_minalign() and the slab code
changes that set the corresponding minimum alignment on the newly created
kmalloc() caches. Patch 10 defines arch_kmalloc_minalign() for arm64,
returning cache_line_size(), and reduces ARCH_KMALLOC_MINALIGN to 64.
ARCH_DMA_MINALIGN remains 128 on arm64.
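
The interaction between the static and the dynamic minimum can be
sketched in userspace roughly as follows. This is a model under stated
assumptions, not the slab code from the patches: demo_cache_line_size
stands in for whatever cache_line_size() reports at boot,
smallest_kmalloc_size() is a hypothetical helper, and taking the maximum
of the two values is my reading of "minimum (static) guaranteed
alignment".

```c
#include <assert.h>
#include <stddef.h>

#define ARCH_KMALLOC_MINALIGN 64	/* static minimum proposed for arm64 */

/* Stand-in for the value cache_line_size() reports at boot (assumption). */
static unsigned int demo_cache_line_size = 64;

/* Model of the arm64 hook described above: return the runtime line size. */
static unsigned int arch_kmalloc_minalign(void)
{
	return demo_cache_line_size;
}

/*
 * Hypothetical helper: size of the smallest kmalloc cache once the runtime
 * minimum alignment is taken into account. Assumes the dynamic value never
 * drops below the static ARCH_KMALLOC_MINALIGN guarantee.
 */
static size_t smallest_kmalloc_size(void)
{
	unsigned int align = arch_kmalloc_minalign();

	assert(align && (align & (align - 1)) == 0);	/* must be a power of two */
	return align > ARCH_KMALLOC_MINALIGN ? align : ARCH_KMALLOC_MINALIGN;
}
```

In this model a 64-byte line size keeps the smallest cache at 64 bytes,
while a 128-byte line size raises it to 128 at boot without any change
to the compile-time constant.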

I don't have access to it but there's the Fujitsu A64FX with a CWG of
256 (the arm64 cache_line_size() returns 256). This series will bump the
smallest kmalloc cache to kmalloc-256. The platform is known to be fully
cache coherent (or so I think) and we decided long ago not to bump
ARCH_DMA_MINALIGN to 256. If problematic, we could make the dynamic
kmalloc() alignment on arm64 min(ARCH_DMA_MINALIGN, cache_line_size()).
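
A minimal sketch of that fallback, assuming a hypothetical helper named
clamped_kmalloc_minalign() that takes the runtime line size as a
parameter:

```c
#define ARCH_DMA_MINALIGN 128	/* arm64 value, unchanged by this series */

/*
 * Hypothetical helper: cap the dynamic kmalloc() alignment at
 * ARCH_DMA_MINALIGN, i.e. min(ARCH_DMA_MINALIGN, line_size).
 */
static unsigned int clamped_kmalloc_minalign(unsigned int line_size)
{
	return line_size < ARCH_DMA_MINALIGN ? line_size : ARCH_DMA_MINALIGN;
}
```

With a 256-byte CWG this yields 128, so the smallest cache would be
kmalloc-128 rather than kmalloc-256, while 64-byte SoCs are unaffected.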

This series is beneficial to arm64 even if it's only reducing the
kmalloc() minimum alignment to 64. While it would be nice to reduce this
further to 8 (or 16) on SoCs known to be fully DMA coherent, detecting
this via arch_setup_dma_ops() is problematic, especially with late
probed devices. I'd leave it for an additional RFC series on top of
this (there are ideas like bounce buffering for non-coherent devices if
the SoC was deemed coherent).

Thanks.

Catalin Marinas (10):
  mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  mm/slab: Allow dynamic kmalloc() minimum alignment
  mm/slab: Simplify create_kmalloc_cache() args and make it static
  arm64: Enable dynamic kmalloc() minimum alignment

 arch/arm64/include/asm/cache.h |  1 +
 arch/arm64/kernel/cacheinfo.c  |  7 ++++++
 drivers/base/devres.c          |  4 ++--
 drivers/gpu/drm/drm_managed.c  |  4 ++--
 drivers/md/dm-crypt.c          |  2 +-
 drivers/spi/spidev.c           |  2 +-
 drivers/usb/core/buffer.c      |  8 +++----
 drivers/usb/misc/usbtest.c     |  2 +-
 include/linux/crypto.h         |  2 +-
 include/linux/slab.h           | 25 ++++++++++++++++-----
 mm/slab.c                      |  6 +----
 mm/slab.h                      |  5 ++---
 mm/slab_common.c               | 40 ++++++++++++++++++++++------------
 13 files changed, 69 insertions(+), 39 deletions(-)



