[PATCH V2 10/11] coresight: sink: Add TRBE driver

Fri Jan 15 07:43:09 EST 2021

On 1/15/21 5:29 AM, Anshuman Khandual wrote:
> 
> 
> On 1/13/21 8:58 PM, Suzuki K Poulose wrote:
>> Hi Anshuman,
>>
>> The driver looks overall good to me. Please find some minor comments below
>>
>> On 1/13/21 4:18 AM, Anshuman Khandual wrote:
>>> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
>>> accessible via the system registers. The TRBE supports different addressing
>>> modes including CPU virtual address and buffer modes including the circular
>>> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
>>> an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
>>> access to the trace buffer could be prohibited by a higher exception level
>>> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
>>> private interrupt (PPI) on address translation errors and when the buffer
>>> is full. Overall implementation here is inspired from the Arm SPE driver.
>>>
>>> Cc: Mathieu Poirier <mathieu.poirier at linaro.org>
>>> Cc: Mike Leach <mike.leach at linaro.org>
>>> Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual at arm.com>

...

>>> +
>>> +/*
>>> + * TRBE Buffer Management
>>> + *
>>> + * The TRBE buffer spans from the base pointer till the limit pointer. When enabled,
>>> + * it starts writing trace data from the write pointer onward till the limit pointer.
>>
>>
>>> + * When the write pointer reaches the address just before the limit pointer, it gets
>>> + * wrapped around again to the base pointer. This is called a TRBE wrap event, which
>>> + * generates a maintenance interrupt when operated in WRAP or STOP mode.
>>
>> According to the TRM, it is FILL mode, instead of STOP. So please change the above to:
>>
>> "operated in WRAP or FILL mode".
> 
> Updated.
> 
>>
>>
>>>      The write
>>> + * pointer again starts writing trace data from the base pointer until just before
>>> + * the limit pointer before getting wrapped again with an IRQ and this process just
>>> + * goes on as long as the TRBE is enabled.
>>
>> This could be dropped as it applies to WRAP/CIRCULAR buffer mode, which we don't use.
> 
> Probably this could be changed a bit to match the FILL mode. Because it is essential
> to describe the continuous nature of the buffer operation, even in the FILL mode.
> 
>   * After TRBE
>   * IRQ gets handled and enabled again, write pointer again starts writing trace data
>   * from the base pointer until just before the limit pointer before getting wrapped
>   * again with an IRQ and this process just goes on as long as the TRBE is enabled.
> 

The above doesn't parse well and kind of repeats the operation of TRBE which is
already explained above. How about :

 >>> + * When the write pointer reaches the address just before the limit pointer, it gets
 >>> + * wrapped around again to the base pointer. This is called a TRBE wrap event, which
 >>> + * generates a maintenance interrupt when operated in WRAP or STOP mode.

This driver uses FILL mode, where the TRBE stops the trace collection at wrap event.
The IRQ handler updates the AUX buffer and re-enables the TRBE with updated WRITE and
LIMIT pointers.

>>> +
>>> +static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
>>> +                   struct perf_event *event, void **pages,
>>> +                   int nr_pages, bool snapshot)
>>> +{
>>> +    struct trbe_buf *buf;
>>> +    struct page **pglist;
>>> +    int i;
>>> +
>>> +    if ((nr_pages < 2) || (snapshot && (nr_pages & 1)))
>>
>> This restriction on snapshot could be removed now, since we use the
>> full buffer.
> 
> Dropped only the second condition here i.e (snapshot && (nr_pages & 1).
> Just wondering if the aux buffer could work with a single page so that
> the first condition can also be dropped.

I think it is good to keep the restriction of 2 pages, as the WRITE_PTR
and the LIMIT_PTR must be page aligned. With a single page, you can't do
much, with writing into a partially filled buffer. This may be added
as a comment to explain the restriction.

>>> +
>>> +static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle)
>>> +{
>>> +    int ec = get_trbe_ec();
>>> +    int bsc = get_trbe_bsc();
>>> +
>>> +    WARN_ON(is_trbe_running());
>>> +    if (is_trbe_trg() || is_trbe_abort())
>>
>> We seem to be reading the TRBSR every single in these helpers. Could we optimise them
>> by passing the register value in ?
> 
> The same goes for get_trbe_ec() and get_trbe_bsc() as well. Probably all
> TRBSR field probing helpers should be modified to accept a TRBSR register
> value instead.
> 
>>
>> i.e
>>      u64 trbsr = get_trbe_status();
>>
>>      WARN_ON(is_trbe_runnign(trbsr))
>>      if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
>>
>> For is_trbe_wrap() too
> 
> Yes.
> 
>>

>>
>> We should skip the driver init, if the kernel is unmapped at EL0,
>> as the TRBE can't safely write to the kernel virtual addressed
>> buffer when the CPU is running at EL0. This is unlikely, but we
>> should cover that case.
> 
> This should be sufficient or it needs a pr_err() as well ?
> 

Please add a pr_err() message to indicate why this failed. Otherwise
the user could be left with no clue.

Cheers
Suzuki