[PATCH 15/21] ARM: dma-mapping: always invalidate WT caches before DMA

Arnd Bergmann arnd at arndb.de
Fri Mar 31 03:38:45 PDT 2023


On Fri, Mar 31, 2023, at 11:35, Russell King (Oracle) wrote:
> On Fri, Mar 31, 2023 at 10:07:28AM +0100, Russell King (Oracle) wrote:
>> On Mon, Mar 27, 2023 at 02:13:11PM +0200, Arnd Bergmann wrote:
>> > From: Arnd Bergmann <arnd at arndb.de>
>> > 
>> > Most ARM CPUs can have write-back caches, which require
>> > cache management to be done in the dma_sync_*_for_device()
>> > operation. This is typically done in both writeback and
>> > writethrough mode.
>> > 
>> > The cache-v4.S (arm720/740/7tdmi/9tdmi) and cache-v4wt.S
>> > (arm920t, arm940t) implementations are the exception here,
>> > and only do the cache management after the DMA is complete,
>> > in the dma_sync_*_for_cpu() operation.
>> > 
>> > Change this for consistency with the other platforms. This
>> > should have no user visible effect.
>> 
>> NAK...
>> 
>> The reason we do cache management _after_ is to ensure that there
>> is no stale data. The kernel _has_ (at the very least in the past)
>> performed DMA to data structures that are embedded within other
>> data structures, resulting in cache lines being shared. If one of
>> those cache lines is touched while DMA is progressing, then we
>> must do cache management _after_ the DMA operation has completed.
>> Doing it before is no good.
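
To make that hazard concrete, the problematic layout looks something
like this (a made-up example, not taken from any real driver):

#include <linux/types.h>

struct foo_device {
	int	irq_count;	/* updated by the CPU in the IRQ handler... */
	u8	rx_buf[16];	/* ...right next to a buffer the device
				   DMAs into */
};

/*
 * With 32-byte cache lines, 'irq_count' and at least the start of
 * 'rx_buf' will typically end up in the same line.  Touching
 * 'irq_count' while the DMA is in flight refills that line with the
 * pre-DMA contents of 'rx_buf'; on a write-through cache the only way
 * to guarantee the CPU later sees the DMA'd data is to invalidate
 * again after the transfer, i.e. in dma_sync_*_for_cpu().
 */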

What I'm trying to address here is the inconsistency between
implementations. If we decide that we always want to invalidate
after FROM_DEVICE, I can do that as part of the series, but then
I have to change most of the other arm implementations.

Right now, the only WT cache implementations that do the
invalidation after the DMA are cache-v4.S (arm720 integrator and
clps711x), cache-v4wt.S (arm920/arm922 at91rm9200, clps711x,
ep93xx, omap15xx, imx1 and integrator), some sparc32 leon3 and
early xtensa.

Most architectures that have write-through caches (m68k,
microblaze) or write-back caches but no speculation (all other
armv4/armv5, hexagon, openrisc, sh, most mips, later xtensa)
only invalidate before DMA but not after.

OTOH, most machines that are actually in use today (armv6+,
powerpc, later mips, microblaze, riscv, nios2) also have to
deal with speculative accesses, so they end up having to
invalidate or flush both before and after a DMA_FROM_DEVICE
and DMA_BIDIRECTIONAL.
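
Roughly, the two hook points end up looking like this (a simplified
sketch, with dma_cache_wback()/dma_cache_inv() as placeholder names
rather than the real per-arch helpers):

#include <linux/dma-direction.h>
#include <linux/types.h>

/* placeholders standing in for the per-arch cache primitives */
void dma_cache_wback(phys_addr_t paddr, size_t size);
void dma_cache_inv(phys_addr_t paddr, size_t size);

static void sketch_sync_for_device(phys_addr_t paddr, size_t size,
				   enum dma_data_direction dir)
{
	if (dir == DMA_TO_DEVICE)
		dma_cache_wback(paddr, size);	/* make CPU writes visible */
	else
		dma_cache_inv(paddr, size);	/* drop stale/dirty lines   */
}

static void sketch_sync_for_cpu(phys_addr_t paddr, size_t size,
				enum dma_data_direction dir)
{
	/*
	 * The "invalidate before" implementations do nothing here for
	 * FROM_DEVICE.  Speculating CPUs (and the cache-v4/v4wt style
	 * "invalidate after" policy) have to invalidate again so that
	 * lines prefetched or refilled during the transfer are dropped.
	 */
	if (dir != DMA_TO_DEVICE)
		dma_cache_inv(paddr, size);
}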

> It looks like the main offender of "touching cache lines shared
> with DMA" has now been resolved - that was the SCSI sense buffer,
> and was fixed some time ago:
>
> commit de25deb18016f66dcdede165d07654559bb332bc
> Author: FUJITA Tomonori <fujita.tomonori at lab.ntt.co.jp>
> Date:   Wed Jan 16 13:32:17 2008 +0900
>
> /if/ that is the one and only case, then we're probably fine, but
> having been through an era where this kind of thing was the norm
> and requests to fix it did not get great responses from subsystem
> maintainers, I just don't trust the kernel not to want to DMA to
> overlapping cache lines.

Thanks for digging that out, that is very useful. It looks like this
was around the same time as 03d70617b8a7 ("powerpc: Prevent memory
corruption due to cache invalidation of unaligned DMA buffer"), so
it may well have been related. I know we also had more recent
problems with USB drivers trying to DMA to the stack, which also
causes problems on non-coherent machines, but some of these were
only found after we introduced VMAP_STACK.
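
The typical buggy pattern was something like this (a made-up fragment,
not pointing at any specific driver):

#include <linux/usb.h>
#include <linux/usb/ch9.h>

static int frob_get_status(struct usb_device *udev)	/* hypothetical */
{
	u8 status;	/* BAD: on-stack buffer handed to DMA */

	/*
	 * With VMAP_STACK the stack may not even have a usable DMA
	 * address; on non-coherent machines the cache maintenance also
	 * hits whatever else shares the stack's cache lines.  The
	 * buffer needs to come from kmalloc() instead.
	 */
	return usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
			       USB_REQ_GET_STATUS,
			       USB_DIR_IN | USB_RECIP_DEVICE,
			       0, 0, &status, sizeof(status), 1000);
}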

It would be nice to use KASAN to prevent reads on cache lines that
have in-flight DMA.

     Arnd


