[PATCH 3/5] kvx: Implement dma handling primitives

Wed Mar 3 09:14:47 GMT 2021

Hi Jules,

Am Dienstag, dem 02.03.2021 um 11:58 +0100 schrieb Jules Maselbas:
> Hi Lucas and Ahmad,
> 
> On Tue, Mar 02, 2021 at 11:14:09AM +0100, Lucas Stach wrote:
> > Am Dienstag, dem 02.03.2021 um 09:37 +0100 schrieb Ahmad Fatoum:
> > > Hello Jules, Yann,
> > > 
> > > On 01.03.21 16:58, Jules Maselbas wrote:
> > > > From: Yann Sionneau <ysionneau at kalray.eu>
> > > Some comments inline. I am not a cache cohereny expert, so take
> > > it with a grain of salt.
> > > 
> > > > +static inline void *dma_alloc_coherent(size_t size, dma_addr_t *dma_handle)
> > > > +{
> > > > +	void *ret = xmemalign(PAGE_SIZE, size);
> > > > +
> > > > +	if (dma_handle)
> > > > +		*dma_handle = (dma_addr_t)(uintptr_t)ret;
> > > > +
> > > > +	return ret;
> > > > +}
> > > 
> > > This would imply that the CPU barebox is booting is coherent with all
> > > 
> > > devices that barebox needs to access. Is that the case?
> > > 
> > > (See below)
> > > 
> This is bogus, memory is not coherent with all devices, this should be
> handled by the mmu, which is currently not supported in our barebox port.
> Using this can lead to coherency issues. We can either drop this
> function, so that is leads to an error at link time, or add a call to
> BUG for a runtime error.
> 
> Right now we aren't using any driver that require dma_alloc_coherent,
> but we use drivers that requires dma_alloc and dma_map_single instead.

I would vote for a BUILD_BUG_ON_MSG in this function, so you get a
compile time error and you can state what needs to be done in order to
get rid of the failure.

> > > > +/*
> > > > + * The implementation of arch should follow the following rules:
> > > > + *		map		for_cpu		for_device	unmap
> > > > + * TO_DEV	writeback	none		writeback	none
> > > > + * FROM_DEV	invalidate	invalidate(*)	invalidate	invalidate(*)
> > > > + * BIDIR	writeback	invalidate	writeback	invalidate
> > > > + *
> > > > + * (*) - only necessary if the CPU speculatively prefetches.
> > > > + *
> > > > + * (see https://lkml.org/lkml/2018/5/18/979)
> > > > + */
> > > > +
> > > > +void dma_sync_single_for_device(dma_addr_t addr, size_t size,
> > > > +				enum dma_data_direction dir)
> > > > +{
> > > > +	switch (dir) {
> > > > +	case DMA_FROM_DEVICE:
> > > > +		kvx_dcache_invalidate_mem_area(addr, size);
> > 
> > Why do you need to explicitly invalidate, but not flush? Even if the
> > CPU speculatively prefetches, the coherency protocol should make sure
> > to invalidate the speculatively loaded lines, right?
> Since we don't have a coherent memory, here we need to invalidate L1
> dcache to let the CPU see deivce's writes in memory.
> Also every write goes through the cache, flush is not required.

Ah, if all your caches are write-through that makes sense. Can you add
a comment somewhere stating that this implementation assumes WT caches
on KVX? This way we can avoid the confusion Ahamd and myself fell into
when glancing over the code.

Regards,
Lucas