[PATCH v2] coresight: tmc-etr: Speed up for bounce buffer in flat mode

Mon Jul 12 04:17:04 PDT 2021

On 12/07/2021 12:09, Leo Yan wrote:
> Hi Suzuki,
> 
> On Mon, Jul 12, 2021 at 10:55:32AM +0100, Suzuki Kuruppassery Poulose wrote:
> 
> [...]
> 
>>>    static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
>>>    {
>>> +	struct etr_flat_buf *flat_buf = etr_buf->private;
>>> +	struct device *real_dev = flat_buf->dev->parent;
>>> +
>>>    	/*
>>>    	 * Adjust the buffer to point to the beginning of the trace data
>>>    	 * and update the available trace data.
>>> @@ -648,6 +668,28 @@ static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
>>>    		etr_buf->len = etr_buf->size;
>>>    	else
>>>    		etr_buf->len = rwp - rrp;
>>> +
>>> +	if (etr_buf->offset + etr_buf->len > etr_buf->size) {
>>> +		int len1, len2;
>>> +
>>> +		/*
>>> +		 * If trace data is wrapped around, sync AUX bounce buffer
>>> +		 * for two chunks: "len1" is for the trace data length at
>>> +		 * the tail of bounce buffer, and "len2" is the length from
>>> +		 * the start of the buffer after wrapping around.
>>> +		 */
>>> +		len1 = etr_buf->size - etr_buf->offset;
>>> +		len2 = etr_buf->len - len1;
>>> +		dma_sync_single_for_cpu(real_dev,
>>> +					flat_buf->daddr + etr_buf->offset,
>>> +					len1, DMA_FROM_DEVICE);
>>> +		dma_sync_single_for_cpu(real_dev, flat_buf->daddr,
>>> +					len2, DMA_FROM_DEVICE);
>>
>> We always start tracing at the beginning of the buffer and the only reason
>> why we would get a wrap around, is when the buffer is full.
>> So you could as well sync the entire buffer in one go
>>
>> 		dma_sync_single_for_cpu(real_dev, flat_buf->daddr,
>> 					etr_buf->len, DMA_FROM_DEVICE);
> 
> I am not sure why you conclude that we "always start tracing at the
> beginning of the buffer".  I read the driver but cannot find any code
> that resets rrp and rwp after fetching the trace data. Is there some
> implicit operation that resets the pointers?

The ETR is always programmed with the base address of the "ETR" buffer,
which is *not the same* as the perf ring buffer, since we always do
double buffering. We do not program the RRP/RWP of the ETR (except
for the SoC-600, where it is mandatory and we set them to the base
address). Thus no context is carried over in the ETR buffer between runs.
But at the end of the run, we do read the RRP/RWP to figure out
where the ETR has reached.

As for resetting the RRP/RWP: at the beginning of a session the
hardware implicitly resets them to the base address (except on
SoC-600 ETRs, where the driver sets them, as explained above).
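
To illustrate the point, here is a rough userspace model of the
behaviour described above. It is only a sketch, not driver code:
struct etr_model, struct sync_range and the etr_model_*() helpers are
made up for this mail, and the model assumes the write pointer simply
wraps back to the base once it reaches the end of the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one ETR session; names are illustrative only. */
struct etr_model {
	uint64_t base;	/* base address programmed into the ETR */
	uint64_t size;	/* size of the ETR buffer in bytes */
	uint64_t rrp;	/* read pointer */
	uint64_t rwp;	/* write pointer */
	bool full;	/* set once the write pointer has wrapped */
};

struct sync_range {
	uint64_t offset;	/* start of valid data, relative to base */
	uint64_t len;		/* number of valid bytes */
};

/* Every session starts with both pointers at the base address. */
static void etr_model_start(struct etr_model *m, uint64_t base, uint64_t size)
{
	assert(size > 0);
	m->base = base;
	m->size = size;
	m->rrp = base;
	m->rwp = base;
	m->full = false;
}

/* Model the hardware writing "bytes" of trace, wrapping at the end. */
static void etr_model_write(struct etr_model *m, uint64_t bytes)
{
	uint64_t off = (m->rwp - m->base) + bytes;

	if (off >= m->size)
		m->full = true;
	m->rwp = m->base + off % m->size;
	/* Once full, the oldest valid data starts at the write pointer. */
	if (m->full)
		m->rrp = m->rwp;
}

/* Derive offset/len the same way the flat-buffer sync does. */
static struct sync_range etr_model_sync(const struct etr_model *m)
{
	struct sync_range r;

	r.offset = m->rrp - m->base;
	r.len = m->full ? m->size : m->rwp - m->rrp;
	return r;
}

/* True when the valid data crosses the end of the buffer. */
static bool etr_model_wraps(const struct etr_model *m)
{
	struct sync_range r = etr_model_sync(m);

	return r.offset + r.len > m->size;
}
```

With a 4096-byte buffer, writing 1000 bytes gives offset 0 and len
1000, so nothing wraps. Writing another 4000 bytes wraps the write
pointer, giving offset 904 and len 4096. In this model the data only
wraps when len equals the buffer size, which is why a single sync
over the whole buffer is enough.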

Suzuki