[PATCH v2 3/5] kvx: Implement dma handling primitives
Jules Maselbas
jmaselbas at kalray.eu
Mon May 17 03:56:59 PDT 2021
Hi Ahmad,
On Mon, May 17, 2021 at 12:42:21PM +0200, Ahmad Fatoum wrote:
> Hello Jules, Yann,
>
> On 05.03.21 19:33, Jules Maselbas wrote:
> > From: Yann Sionneau <ysionneau at kalray.eu>
> >
> > Signed-off-by: Jules Maselbas <jmaselbas at kalray.eu>
>
> [snip]
>
> > diff --git a/arch/kvx/lib/dma-default.c b/arch/kvx/lib/dma-default.c
> > new file mode 100644
> > index 000000000..2a4144696
> > --- /dev/null
> > +++ b/arch/kvx/lib/dma-default.c
> > @@ -0,0 +1,94 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// SPDX-FileCopyrightText: 2021 Yann Sionneau <ysionneau at kalray.eu>, Kalray Inc.
> > +
> > +#include <dma.h>
> > +#include <asm/barrier.h>
> > +#include <asm/io.h>
> > +#include <asm/cache.h>
> > +#include <asm/sfr.h>
> > +#include <asm/sys_arch.h>
> > +
> > +/*
> > + * The implementation of arch should follow the following rules:
> > + * map for_cpu for_device unmap
> > + * TO_DEV writeback none writeback none
> > + * FROM_DEV invalidate invalidate(*) invalidate invalidate(*)
> > + * BIDIR writeback invalidate writeback invalidate
> > + *
> > + * (*) - only necessary if the CPU speculatively prefetches.
> > + *
> > + * (see https://lkml.org/lkml/2018/5/18/979)
> > + */
> > +
> > +void dma_sync_single_for_device(dma_addr_t addr, size_t size,
> > + enum dma_data_direction dir)
> > +{
> > + /* dcache is Write-Through: no need to flush to force writeback */
> > + switch (dir) {
> > + case DMA_FROM_DEVICE:
> > + invalidate_dcache_range(addr, addr + size);
> > + break;
> > + case DMA_TO_DEVICE:
> > + case DMA_BIDIRECTIONAL:
> > + /* allow device to read buffer written by CPU */
> > + wmb();
> > + break;
> > + default:
> > + BUG();
> > + }
> > +}
> > +
> > +void dma_sync_single_for_cpu(dma_addr_t addr, size_t size,
> > + enum dma_data_direction dir)
> > +{
> > + /* CPU does not speculatively prefetches */
> > + switch (dir) {
> > + case DMA_FROM_DEVICE:
> > + /* invalidate has been done during map/for_device */
> > + case DMA_TO_DEVICE:
> > + break;
> > + case DMA_BIDIRECTIONAL:
> > + invalidate_dcache_range(addr, addr + size);
> > + break;
> > + default:
> > + BUG();
> > + }
> > +}
>
> Both of these work on CPU pointers.
>
> > +
> > +#define KVX_DDR_ALIAS_OFFSET \
> > + (KVX_DDR_64BIT_RAM_WINDOW_BA - KVX_DDR_32BIT_RAM_WINDOW_BA)
> > +#define KVX_DDR_ALIAS_WINDOW \
> > + (KVX_DDR_64BIT_RAM_WINDOW_BA + KVX_DDR_ALIAS_OFFSET)
> > +
> > +/* Local smem is aliased between 0 and 16MB */
> > +#define KVX_SMEM_LOCAL_ALIAS 0x1000000ULL
> > +
> > +dma_addr_t dma_map_single(struct device_d *dev, void *ptr, size_t size,
> > + enum dma_data_direction dir)
> > +{
> > + uintptr_t addr = (uintptr_t) ptr;
> > +
> > + dma_sync_single_for_device(addr, size, dir);
>
> So this is correct.
>
> > +
> > + /* Local smem alias should never be used for dma */
> > + if (addr < KVX_SMEM_LOCAL_ALIAS)
> > + return addr + (1 + kvx_cluster_id()) * KVX_SMEM_LOCAL_ALIAS;
> > +
> > + if (dev->dma_mask && addr <= dev->dma_mask)
> > + return addr;
> > +
> > + if (addr >= KVX_DDR_ALIAS_WINDOW)
> > + return DMA_ERROR_CODE;
> > +
> > + addr -= KVX_DDR_ALIAS_OFFSET;
> > + if (dev->dma_mask && addr > dev->dma_mask)
> > + return DMA_ERROR_CODE;
> > +
> > + return addr;
> > +}
>
> Now you compute a dma_addr_t as CPU pointer - KVX_DDR_ALIAS_OFFSET.
>
> > +
> > +void dma_unmap_single(struct device_d *dev, dma_addr_t addr, size_t size,
> > + enum dma_data_direction dir)
> > +{
> > + dma_sync_single_for_cpu(addr, size, dir);
>
> And then that dma_addr_t is passed here without + KVX_DDR_ALIAS_OFFSET
> to get a CPU pointer out.
> So for DMA_BIDIRECTIONAL you'd invalidate the wrong cache region.
Yes, this is bogus but I believe in our case dma_unmap_single is never
called with DMA_BIDIRECTIONAL.
>
> I stumbled upon this, because I noticed that that kvx whould've failed to
> build starting with v2021.04.0, because following commits conflict with each other:
> 3f975f810bd3 ("dma: move dma_map/unmap_single from ARM to common code") commited on 2021-03-03
> 4808d6f80073 ("kvx: Implement dma handling primitives") commited on 2021-03-05
>
> Now, dma_map_single() is defined twice for kvx.
Yes, I noticed this as well when rebasing on barebox v2021.04.
> As dma_(un?)map_single is always called with a dev argument, couldn't you define the
> DMA offsets in the device tree and use the common drivers/dma/map.c implementation
> for these two functions?
Our custom implementation will be removed in favor of the common
implementation and the uses of dma-range.
I had to add a dma-range to the dwc2 device nodes in our device tree, so
when the driver does dma_map/unmap the common implementation will do the
correct remap to the aliased region (so addresses fits in 32bit).
However I do not know how Linux handle a mix of dma-range and iommu.
Since I didn't had much time to test this we are still using v2021.03.
Thanks,
Jules
More information about the barebox
mailing list