[PATCH v2 06/10] coresight: trbe: Fix handling of spurious interrupts

Suzuki K Poulose suzuki.poulose at arm.com
Fri Jul 30 05:57:47 PDT 2021


On 30/07/2021 06:15, Anshuman Khandual wrote:
> 
> 
> On 7/23/21 6:16 PM, Suzuki K Poulose wrote:
>> On a spurious IRQ, right now we disable the TRBE and then re-enable
>> it back, resetting the "buffer" pointers(i.e BASE, LIMIT and more
>> importantly WRITE) to the original pointers from the AUX handle.
>> This implies that we overwrite any trace that was written so far,
>> (by overwriting TRBPTR) while we should have ignored the IRQ.
> 
> The ideas was that a state (pointers) reset would improve the chances
> of not getting the spurious IRQ once again. This is assuming that some
> thing during this current state machine, had caused the spurious IRQ.
> Hence just restart it back from the beginning. Yes, it does lose some
> trace data but whats the real possibility of such spurious IRQs in the
> first place ?
> 
>>
>> This patch cleans the behavior, by only stopping the TRBE if the
>> IRQ was indeed raised, as we can read the TRBSR without stopping
>> the TRBE (Only writes to the TRBSR requires the TRBE disabled).
>> And also, on detecting a spurious IRQ after examining the TRBSR,
>> we simply re-enable the TRBE without touching the other parameters.
> 
> This makes sense. I was not sure if TRBSR could be safely read without
> actually stopping the TRBE.
> 
>>
>> Cc: Anshuman Khandual <anshuman.khandual at arm.com>
>> Cc: Mathieu Poirier <mathieu.poirier at linaro.org>
>> Cc: Mike Leach <mike.leach at linaro.org>
>> Cc: Leo Yan <leo.yan at linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose at arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-trbe.c | 29 ++++++++++----------
>>   1 file changed, 15 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 62e1a08f73ff..503bea0137ae 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -679,15 +679,16 @@ static int arm_trbe_disable(struct coresight_device *csdev)
>>   
>>   static void trbe_handle_spurious(struct perf_output_handle *handle)
>>   {
>> -	struct trbe_buf *buf = etm_perf_sink_config(handle);
>> +	u64 limitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
>>   
>> -	buf->trbe_limit = compute_trbe_buffer_limit(handle);
>> -	buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
>> -	if (buf->trbe_limit == buf->trbe_base) {
>> -		trbe_drain_and_disable_local();
>> -		return;
>> -	}
>> -	trbe_enable_hw(buf);
>> +	/*
>> +	 * If the IRQ was spurious, simply re-enable the TRBE
>> +	 * back without modifiying the buffer parameters to
> 
> Typo here 		^^^^^^ s/modifiying/modifying
> 
>> +	 * retain the trace collected so far.
>> +	 */
>> +	limitr |= TRBLIMITR_ENABLE;
>> +	write_sysreg_s(limitr, SYS_TRBLIMITR_EL1);
>> +	isb();
>>   }
>>   
>>   static void trbe_handle_overflow(struct perf_output_handle *handle)
>> @@ -760,12 +761,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
>>   	enum trbe_fault_action act;
>>   	u64 status;
>>   
>> -	/*
>> -	 * Ensure the trace is visible to the CPUs and
>> -	 * any external aborts have been resolved.
>> -	 */
>> -	trbe_drain_and_disable_local();
>> -
>> +	/* Reads to TRBSR_EL1 is fine when TRBE is active */
>>   	status = read_sysreg_s(SYS_TRBSR_EL1);
>>   	/*
>>   	 * If the pending IRQ was handled by update_buffer callback
>> @@ -774,6 +770,11 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
>>   	if (!is_trbe_irq(status))
> 
> Warn here that a non-related IRQ has been delivered to this handler ?
> But moving the trbe_drain_and_disable_local() later, enables it to
> return back immediately after detecting an unrelated IRQ.

Not really. There could be race with the update_buffer(), see the 
comment right above that. When that happens, we have disabled the
TRBE in the update_buffer(). Either case, we have nothing to do.

> 
>>   		return IRQ_NONE;
>>   
>> +	/*
>> +	 * Ensure the trace is visible to the CPUs and
>> +	 * any external aborts have been resolved.
>> +	 */
>> +	trbe_drain_and_disable_local();
>>   	clr_trbe_irq();
>>   	isb();
>>   
>>
> 
> Actually there are two types of spurious interrupts here.
> 
> 1. Non-TRBE spurious interrupt
> 
> Fails is_trbe_irq() test and needs to be returned immediately from
> arm_trbe_irq_handler(), after an warning for the platform IRQ
> delivery wiring.

Not necessarily warrant a WARNING. See above.

> 
> 2. TRBE spurious interrupt
> 
> Clears is_trbe_irq() and get handled in trbe_handle_spurious(). I
> still think leaving this unchanged might be better as it reduces
> the chance of getting further spurious TRBE interrupts.

How does it reduce the chances of getting another spurious interrupt ?
If the TRBE gets a spurious IRQ, that we cannot decode, I would rather
leave it as NOP.

Suzuki




More information about the linux-arm-kernel mailing list