[PATCH] perf: arm_spe: Add barrier before enabling profiling buffer

James Clark james.clark at linaro.org
Thu Feb 19 04:08:27 PST 2026



On 06/02/2026 9:50 am, James Clark wrote:
> 
> 
> On 03/02/2026 11:07 am, Will Deacon wrote:
>> On Tue, Feb 03, 2026 at 10:46:37AM +0000, James Clark wrote:
>>>
>>>
>>> On 02/02/2026 7:03 pm, Will Deacon wrote:
>>>> On Fri, Jan 23, 2026 at 04:03:53PM +0000, James Clark wrote:
>>>>> The Arm ARM known issues document [1] states that the architecture will
>>>>> be relaxed so that the profiling buffer must be correctly configured
>>>>> when ProfilingBufferEnabled() && !SPEProfilingStopped() &&
>>>>> PMBLIMITR_EL1.FM != DISCARD:
>>>>>
>>>>>     R24557
>>>>>
>>>>>     While the Profiling Buffer is enabled, profiling is not stopped, and
>>>>>     Discard mode is not enabled, all of the following must be true:
>>>>>
>>>>>     * The current write pointer must be at least one sample record below
>>>>>       the write limit pointer.
>>>>>
>>>>> The same relaxation also says that writes may be completely ignored:
>>>>>
>>>>>     When the Profiling Buffer is enabled, profiling is not stopped, and
>>>>>     Discard mode is not enabled, the PE might ignore a direct write to any
>>>>>     of the following Profiling Buffer registers, other than a direct write
>>>>>     to PMBLIMITR_EL1 that clears PMBLIMITR_EL1.E from 1 to 0:
>>>>>
>>>>>     * The current write pointer, PMBPTR_EL1.
>>>>>     * The Limit pointer, PMBLIMITR_EL1.
>>>>>     * PMBSR_EL1.
>>>>
>>>> Thinking about this some more, does that mean that the direct write to
>>>> PMBPTR_EL1 performs an indirect read of PMBLIMITR_EL1 so that it can
>>>> determine the write-ignore semantics? If so, doesn't that mean that
>>>> we'll get order against a subsequent direct write of PMBLIMITR_EL1
>>>> without an ISB thanks to table "D24-1 Synchronization requirements"
>>>> which says that an indirect read followed by a direct write doesn't
>>>> require synchronisation?
>>>>
>>>> There's also a sentence above the table stating:
>>>>
>>>> "Direct writes to System registers are not allowed to affect any
>>>>    instructions appearing in program order before the direct write."
>>>>
>>>> so after all that, I'm not really sure why the ISB is required.
>>>>
>>>> Will
>>>
>>> We were under the impression that this was required for the SPU as it is
>>> treated as a separate entity from the PE.
>>>
>>> In "D17.9 Synchronization and Statistical Profiling" there is:
>>>
>>>    INDWCG
>>>
>>>    A Context Synchronization event guarantees that a direct write to a
>>>    System register made by the PE in program order before the Context
>>>    synchronization event are observable by indirect reads and indirect
>>>    writes of the same System register made by a profiling operation
>>>    relating to a sampled operation in program order after the Context
>>>    synchronization event.
>>>
>>> That specifically mentions an indirect read following a direct write, which
>>> seems to contradict D24-1. Although I thought this is a special case for
>>> SPE.
>>
>> My reading of the text above is that it is covering the direct write
>> -> indirect read case, whereas I think the case in the SPE driver that
>> we're considering for your patch is when we have an indirect read
>> followed by a direct write.
>>
>> Will
> 
> Yeah, and that text also only applies to "profiling operations", not 
> writes to PMBPTR and PMBLIMITR.
> 
> Upon further investigation you are correct about the isb() not being 
> required, even with the new relaxation. Seems like we just accepted that 
> the relaxation required some change to the driver without really 
> thinking about it. But yeah thanks for looking in detail and catching it.
> 
> So we can drop this now. Sorry for the noise.
> 
> James
> 

Hi Will,

I'm back to drag this up again. I think all of the above discussion
relies on the ordering given by the indirect read needed for the "might
ignore a direct write..." part. But the wording is _might_ ignore a direct
write, so an implementation is free not to do that. That leaves two
possible implementations:

#1 Where there is an indirect read to give the write ignore outcome
#2 Where there is no write ignore outcome so it doesn't require an
    indirect read

For #2 there's nothing to force the ordering. We're writing to two
different registers (PMBPTR_EL1 and PMBLIMITR_EL1), and the
PMBLIMITR_EL1 write has to take effect second for the buffer to be
considered correctly configured. For example, if the old value of
PMBPTR_EL1 is higher than the new PMBLIMITR_EL1 and the write to
PMBLIMITR_EL1 takes effect first, then the buffer is misconfigured.
That's why we think we need the isb() here.
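
To make the case for #2 concrete, here is a rough host-side C model of
the two writes. It is only a sketch and not driver code: the struct, the
function names, the enum and the example addresses are all hypothetical,
and the check is a simplified reading of the R24557 text quoted above
(write pointer at least one sample record below the limit while the
buffer is enabled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the register state; not the driver's types. */
struct spe_buf_model {
	uint64_t ptr;	/* models PMBPTR_EL1 */
	uint64_t limit;	/* models the limit address in PMBLIMITR_EL1 */
	bool enabled;	/* models PMBLIMITR_EL1.E */
};

/*
 * Simplified reading of R24557: while the buffer is enabled, the write
 * pointer must be at least one sample record below the limit pointer.
 */
static bool model_configured_ok(const struct spe_buf_model *m,
				uint64_t record_size)
{
	assert(record_size > 0);
	if (!m->enabled)
		return true;
	return m->ptr + record_size <= m->limit;
}

/* The order in which the two writes take effect (hypothetical enum). */
enum write_order { PTR_THEN_LIMIT, LIMIT_THEN_PTR };

/*
 * Apply both writes in the given order and report whether the model was
 * correctly configured after every individual step, not just at the end.
 */
static bool model_reprogram(struct spe_buf_model *m, uint64_t new_ptr,
			    uint64_t new_limit, uint64_t record_size,
			    enum write_order order)
{
	bool ok = true;

	if (order == PTR_THEN_LIMIT) {
		m->ptr = new_ptr;
		if (!model_configured_ok(m, record_size))
			ok = false;
		m->limit = new_limit;
		m->enabled = true;	/* the limit write also sets E */
		if (!model_configured_ok(m, record_size))
			ok = false;
	} else {
		m->limit = new_limit;
		m->enabled = true;	/* enabled while ptr is still stale */
		if (!model_configured_ok(m, record_size))
			ok = false;
		m->ptr = new_ptr;
		if (!model_configured_ok(m, record_size))
			ok = false;
	}
	return ok;
}
```

Starting from a disabled buffer with a stale pointer of 0x9000 and
reprogramming to ptr = 0x1000, limit = 0x2000 with a 64-byte record, the
PTR_THEN_LIMIT order stays valid at every step. LIMIT_THEN_PTR ends in
the same final state but passes through an enabled, misconfigured one.
In the driver, the isb() between the two writes is what would rule out
the second order.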

Thanks
James





More information about the linux-arm-kernel mailing list