[PATCH] arm64: Make ARCH_DMA_MINALIGN configurable

Catalin Marinas catalin.marinas at arm.com
Mon May 17 07:15:15 PDT 2021


On Mon, May 17, 2021 at 02:01:54PM +0200, Ard Biesheuvel wrote:
> On Mon, 17 May 2021 at 13:06, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > On Mon, May 17, 2021 at 09:43:32AM +0200, Vincent Whitchurch wrote:
> > > ARCH_DMA_MINALIGN is hardcoded to 128, but this wastes memory if the
> > > kernel is only intended to be run on platforms with cache line sizes of
> > > 64 bytes.
> > >
> > > Make this configurable (hidden under CONFIG_EXPERT).  Setting this to 64
> > > bytes reduces the slab memory usage of my Cortex-A53-based system by
> > > ~6%, measured right after startup.
> >
> > I agree that we waste some memory since the kmalloc caches start from
> > 128 but I don't think a config option is the right approach.
> >
> > An option would be to try not to rely on the hard-coded
> > ARCH_DMA_MINALIGN when the slab caches are created but use
> > cache_line_size(). It's a bit tricky as the cache_line_size() returned
> > value may be tweaked by DT or PPTT after the boot caches have been
> > created (see commit 7b8c87b297a7).
> >
> > Another option I recall discussing with Arnd about two years ago was to
> > start with the default 128 at boot but add the smaller slab caches
> > later, once we have more information. This can be just another 64 byte
> > cache or even go all the way down to 8 byte if all the devices are
> > cache coherent.
> 
> ARCH_SLAB_MINALIGN is also used to statically align (members of)
> struct types, so doing this at runtime is going to have limited
> effect.

You probably mean ARCH_KMALLOC_MINALIGN. We don't touch
ARCH_SLAB_MINALIGN unless KASAN_{SW,HW}_TAGS is enabled and it is still
maximum 16 (I wonder if it's safe to reduce this to 8 as it might be the
case with KASAN_SW_TAGS).

> If a) ThunderX is the only platform we care about (do we?) that has
> 128 byte cachelines, and b) DMA is cache coherent on such platforms,
> couldn't we separate ARCH_SLAB_MINALIGN from ARCH_DMA_MINALIGN? I.e.,
> set the first to 64 and keep the second at 128?

ARCH_KMALLOC_MINALIGN is indeed the same as ARCH_DMA_MINALIGN. We can't
do much about it when the requirement is DMA. For example, struct devres
has a data[] array aligned to ARCH_KMALLOC_MINALIGN with a comment that
it may be needed for DMA.
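
To illustrate why a runtime change cannot shrink such structures, here is
a minimal userspace sketch, assuming GCC or Clang. The demo_* names are
hypothetical stand-ins and not the kernel definitions; only the aligned
flexible array mirrors what struct devres does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct devres_node; only its size matters. */
struct demo_node {
    void *next;
    void *prev;
};

/* Built with the alignment fixed at 128, as ARCH_KMALLOC_MINALIGN is today. */
struct demo_devres_128 {
    struct demo_node node;
    unsigned char __attribute__((aligned(128))) data[];
};

/* The same structure as it would look if the constant were 64. */
struct demo_devres_64 {
    struct demo_node node;
    unsigned char __attribute__((aligned(64))) data[];
};

/* The header before data[] is padded out to the compile-time alignment. */
static void demo_check_layout(void)
{
    assert(offsetof(struct demo_devres_128, data) == 128);
    assert(offsetof(struct demo_devres_64, data) == 64);
}
```

The offset of data[] is decided by the compiler, so creating smaller slab
caches after boot would leave this padding unchanged.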

I can't tell how many SoCs out there have 128 byte cache lines (it can
be in a system level cache that reacts to cache maintenance by VA) and
are not fully coherent. I know there is one with 256 byte caches but
luckily it's cache coherent.

Even if we can't fix the build-time structure alignment, I think there's
still sufficient benefit in allowing smaller slab caches.
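
As a rough model of where the saving would come from, the sketch below
rounds a request up to the smallest power-of-two bucket of at least a
given minimum size. It is an assumption-laden simplification and not
kernel code: bucket_size() and wasted_bytes() are hypothetical helpers,
and the non-power-of-two caches (kmalloc-96, kmalloc-192) are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: smallest power-of-two bucket >= request,
 * never smaller than min_size. */
static size_t bucket_size(size_t request, size_t min_size)
{
    size_t bucket = min_size;

    /* The model only handles power-of-two minimums. */
    assert(min_size != 0 && (min_size & (min_size - 1)) == 0);

    while (bucket < request)
        bucket <<= 1;
    return bucket;
}

/* Bytes handed out by the allocator but not asked for by the caller. */
static size_t wasted_bytes(size_t request, size_t min_size)
{
    return bucket_size(request, min_size) - request;
}
```

In this model a 24-byte request occupies 128 bytes with a 128-byte
minimum, 64 bytes with a 64-byte minimum, and 32 bytes if the minimum
could go all the way down to 8.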

-- 
Catalin


