[PATCH v3] coresight: tmc-etr: Speed up for bounce buffer in flat mode

Thu Sep 2 05:54:31 PDT 2021

Hi Robin,

On Wed, Sep 01, 2021 at 09:03:30PM +0100, Robin Murphy wrote:

[...]

> > @@ -600,6 +601,7 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
> >   {
> >   	struct etr_flat_buf *flat_buf;
> >   	struct device *real_dev = drvdata->csdev->dev.parent;
> > +	ssize_t	aligned_size;
> >   	/* We cannot reuse existing pages for flat buf */
> >   	if (pages)
> > @@ -609,11 +611,18 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
> >   	if (!flat_buf)
> >   		return -ENOMEM;
> > -	flat_buf->vaddr = dma_alloc_coherent(real_dev, etr_buf->size,
> > -					     &flat_buf->daddr, GFP_KERNEL);
> > -	if (!flat_buf->vaddr) {
> > -		kfree(flat_buf);
> > -		return -ENOMEM;
> > +	aligned_size = PAGE_ALIGN(etr_buf->size);
> > +	flat_buf->pages = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO,
> > +					   get_order(aligned_size));
> > +	if (!flat_buf->pages)
> > +		goto fail_alloc_pages;
> > +
> > +	flat_buf->vaddr = page_address(flat_buf->pages);
> > +	flat_buf->daddr = dma_map_page(real_dev, flat_buf->pages, 0,
> > +				       aligned_size, DMA_FROM_DEVICE);
> 
> Use dma_alloc_noncoherent() rather than open-coding this - bare
> alloc_pages() has no understanding of DMA masks, and you wouldn't want to
> end up in the worst case of dma_map_page() bounce-buffering your bounce
> buffer...

Will rework the code to use dma_alloc_noncoherent(); it's much more
reliable than the open-coded version.
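
Roughly what I have in mind is the untested fragment below; it is not
meant to build on its own. It reuses real_dev, etr_buf and flat_buf from
the quoted hunk and keeps the size and DMA_FROM_DEVICE direction from v3.
Where the sync and free calls end up in the driver is my assumption and
still needs checking.

```c
/* Untested sketch: allocate through the DMA API so the device's DMA mask is respected. */
flat_buf->vaddr = dma_alloc_noncoherent(real_dev, etr_buf->size,
					&flat_buf->daddr,
					DMA_FROM_DEVICE, GFP_KERNEL);
if (!flat_buf->vaddr) {
	kfree(flat_buf);
	return -ENOMEM;
}

/* Assumed placement: before the CPU reads the trace data, sync the buffer for CPU access. */
dma_sync_single_for_cpu(real_dev, flat_buf->daddr, etr_buf->size,
			DMA_FROM_DEVICE);

/* Assumed placement: in the free path, release with the same size and direction. */
dma_free_noncoherent(real_dev, etr_buf->size, flat_buf->vaddr,
		     flat_buf->daddr, DMA_FROM_DEVICE);
```

This also drops the separate aligned_size and struct page bookkeeping,
since the allocation and the mapping come from a single call.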

Thanks a lot for the suggestion!