[PATCH 3/5] kvx: Implement dma handling primitives

Wed Mar 3 09:33:45 GMT 2021

On 03/03/2021 10:14, Lucas Stach wrote:

> Hi Jules,
>
> Am Dienstag, dem 02.03.2021 um 11:58 +0100 schrieb Jules Maselbas:
>> Hi Lucas and Ahmad,
>>
>> On Tue, Mar 02, 2021 at 11:14:09AM +0100, Lucas Stach wrote:
>>> Am Dienstag, dem 02.03.2021 um 09:37 +0100 schrieb Ahmad Fatoum:
>>>> Hello Jules, Yann,
>>>>
>>>> On 01.03.21 16:58, Jules Maselbas wrote:
>>>>> From: Yann Sionneau <ysionneau at kalray.eu>
>>>> Some comments inline. I am not a cache cohereny expert, so take
>>>> it with a grain of salt.
>>>>
>>>>> +static inline void *dma_alloc_coherent(size_t size, dma_addr_t *dma_handle)
>>>>> +{
>>>>> +	void *ret = xmemalign(PAGE_SIZE, size);
>>>>> +
>>>>> +	if (dma_handle)
>>>>> +		*dma_handle = (dma_addr_t)(uintptr_t)ret;
>>>>> +
>>>>> +	return ret;
>>>>> +}
>>>> This would imply that the CPU barebox is booting is coherent with all
>>>>
>>>> devices that barebox needs to access. Is that the case?
>>>>
>>>> (See below)
>>>>
>> This is bogus, memory is not coherent with all devices, this should be
>> handled by the mmu, which is currently not supported in our barebox port.
>> Using this can lead to coherency issues. We can either drop this
>> function, so that is leads to an error at link time, or add a call to
>> BUG for a runtime error.
>>
>> Right now we aren't using any driver that require dma_alloc_coherent,
>> but we use drivers that requires dma_alloc and dma_map_single instead.
> I would vote for a BUILD_BUG_ON_MSG in this function, so you get a
> compile time error and you can state what needs to be done in order to
> get rid of the failure.

If we define the function and put a BUILD_BUG_ON_MSG() inside, I am 
guessing that all builds will fail, right?

But we only want the builds that actually call this function to fail.

Maybe we can just define dma_alloc_coherent() as being a macro, to 
BUILD_BUG_ON_MSG.

Like:

#define dma_alloc_coherent(a, b) BUILD_BUG_ON_MSG(1, "dma_alloc_coherent 
is not supported yet on KVX. You would need to add MMU support to be 
able to map uncached pages")

What do you think?

>
>>>>> +/*
>>>>> + * The implementation of arch should follow the following rules:
>>>>> + *		map		for_cpu		for_device	unmap
>>>>> + * TO_DEV	writeback	none		writeback	none
>>>>> + * FROM_DEV	invalidate	invalidate(*)	invalidate	invalidate(*)
>>>>> + * BIDIR	writeback	invalidate	writeback	invalidate
>>>>> + *
>>>>> + * (*) - only necessary if the CPU speculatively prefetches.
>>>>> + *
>>>>> + * (see https://lkml.org/lkml/2018/5/18/979)
>>>>> + */
>>>>> +
>>>>> +void dma_sync_single_for_device(dma_addr_t addr, size_t size,
>>>>> +				enum dma_data_direction dir)
>>>>> +{
>>>>> +	switch (dir) {
>>>>> +	case DMA_FROM_DEVICE:
>>>>> +		kvx_dcache_invalidate_mem_area(addr, size);
>>> Why do you need to explicitly invalidate, but not flush? Even if the
>>> CPU speculatively prefetches, the coherency protocol should make sure
>>> to invalidate the speculatively loaded lines, right?
>> Since we don't have a coherent memory, here we need to invalidate L1
>> dcache to let the CPU see deivce's writes in memory.
>> Also every write goes through the cache, flush is not required.
> Ah, if all your caches are write-through that makes sense. Can you add
> a comment somewhere stating that this implementation assumes WT caches
> on KVX? This way we can avoid the confusion Ahamd and myself fell into
> when glancing over the code.
>
> Regards,
> Lucas
>
>