[PATCH] perf: arm_spe: Add barrier before enabling profiling buffer
James Clark
james.clark at linaro.org
Thu Feb 19 04:08:27 PST 2026
On 06/02/2026 9:50 am, James Clark wrote:
>
>
> On 03/02/2026 11:07 am, Will Deacon wrote:
>> On Tue, Feb 03, 2026 at 10:46:37AM +0000, James Clark wrote:
>>>
>>>
>>> On 02/02/2026 7:03 pm, Will Deacon wrote:
>>>> On Fri, Jan 23, 2026 at 04:03:53PM +0000, James Clark wrote:
>>>>> The Arm ARM known issues document [1] states that the architecture will
>>>>> be relaxed so that the profiling buffer must be correctly configured
>>>>> when ProfilingBufferEnabled() && !SPEProfilingStopped() &&
>>>>> PMBLIMITR_EL1.FM != DISCARD:
>>>>>
>>>>> R24557
>>>>>
>>>>> While the Profiling Buffer is enabled, profiling is not stopped, and
>>>>> Discard mode is not enabled, all of the following must be true:
>>>>>
>>>>> * The current write pointer must be at least one sample record below
>>>>>   the write limit pointer.
>>>>>
>>>>> The same relaxation also says that writes may be completely ignored:
>>>>>
>>>>> When the Profiling Buffer is enabled, profiling is not stopped, and
>>>>> Discard mode is not enabled, the PE might ignore a direct write to any
>>>>> of the following Profiling Buffer registers, other than a direct write
>>>>> to PMBLIMITR_EL1 that clears PMBLIMITR_EL1.E from 1 to 0:
>>>>>
>>>>> * The current write pointer, PMBPTR_EL1.
>>>>> * The Limit pointer, PMBLIMITR_EL1.
>>>>> * PMBSR_EL1.
>>>>
>>>> Thinking about this some more, does that mean that the direct write to
>>>> PMBPTR_EL1 performs an indirect read of PMBLIMITR_EL1 so that it can
>>>> determine the write-ignore semantics? If so, doesn't that mean that
>>>> we'll get order against a subsequent direct write of PMBLIMITR_EL1
>>>> without an ISB thanks to table "D24-1 Synchronization requirements"
>>>> which says that an indirect read followed by a direct write doesn't
>>>> require synchronisation?
>>>>
>>>> There's also a sentence above the table stating:
>>>>
>>>> "Direct writes to System registers are not allowed to affect any
>>>> instructions appearing in program order before the direct write."
>>>>
>>>> so after all that, I'm not really sure why the ISB is required.
>>>>
>>>> Will
>>>
>>> We were under the impression that this was required for the SPU as it is
>>> treated as a separate entity from the PE.
>>>
>>> In "D17.9 Synchronization and Statistical Profiling" there is:
>>>
>>> INDWCG
>>>
>>> A Context Synchronization event guarantees that a direct write to a
>>> System register made by the PE in program order before the Context
>>> synchronization event are observable by indirect reads and indirect
>>> writes of the same System register made by a profiling operation
>>> relating to a sampled operation in program order after the Context
>>> synchronization event.
>>>
>>> That specifically mentions an indirect read following a direct write,
>>> which seems to contradict D24-1. Although I thought this is a special
>>> case for SPE.
>>
>> My reading of the text above is that it is covering the direct write
>> -> indirect read case, whereas I think the case in the SPE driver that
>> we're considering for your patch is when we have an indirect read
>> followed by a direct write.
>>
>> Will
>
> Yeah, and that text also only applies to "profiling operations", not
> writes to PMBPTR and PMBLIMITR.
>
> Upon further investigation you are correct about the isb() not being
> required, even with the new relaxation. Seems like we just accepted that
> the relaxation required some change to the driver without really
> thinking about it. But yeah thanks for looking in detail and catching it.
>
> So we can drop this now. Sorry for the noise.
>
> James
>
Hi Will,

I'm back to drag this up again. I think all of the above discussion
relies on the ordering given by the indirect read needed for the "might
ignore a direct write..." part. But the text only says the PE _might_
ignore a direct write, so an implementation is free not to do that. That
gives two possible implementations:

 #1 Where there is an indirect read to give the write ignore outcome
 #2 Where there is no write ignore outcome so it doesn't require an
    indirect read

For #2 there's nothing to force the ordering. We're writing to two
different registers (PMBPTR_EL1 and PMBLIMITR_EL1) and the
PMBLIMITR_EL1 write has to come second for the buffer to be considered
correctly configured. For example, if the old value of PMBPTR_EL1 is
higher than the new PMBLIMITR_EL1 and the write to PMBLIMITR_EL1 happens
first, then the buffer is misconfigured. That's why we think we need the
isb() here.
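
To make the #2 case concrete, here is a rough userspace model of the two
write orders. It is only a sketch and not driver code: the struct, the
function names and the 64-byte record size are made up for illustration,
and PMBLIMITR_EL1 is simplified to a limit address plus the E bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this model only: one sample record is 64 bytes. */
#define MODEL_RECORD_SIZE 64u

/* Hypothetical stand-in for the profiling buffer registers. */
struct spe_model {
	uint64_t ptr;	/* models PMBPTR_EL1 */
	uint64_t limit;	/* models the PMBLIMITR_EL1 limit address */
	bool enabled;	/* models PMBLIMITR_EL1.E */
};

/*
 * The quoted R24557 condition: while the buffer is enabled, the write
 * pointer must be at least one sample record below the limit.
 */
bool model_config_ok(const struct spe_model *m)
{
	assert(m != NULL);
	if (!m->enabled)
		return true;
	return m->limit >= MODEL_RECORD_SIZE &&
	       m->ptr <= m->limit - MODEL_RECORD_SIZE;
}

/* Writes take effect in program order: pointer first, then limit + E. */
bool model_in_order(struct spe_model m, uint64_t new_ptr, uint64_t new_limit)
{
	bool ok;

	m.ptr = new_ptr;
	ok = model_config_ok(&m);	/* buffer is still disabled here */
	m.limit = new_limit;
	m.enabled = true;
	return ok && model_config_ok(&m);
}

/* Writes take effect reordered: limit + E first, then the pointer. */
bool model_reordered(struct spe_model m, uint64_t new_ptr, uint64_t new_limit)
{
	bool ok;

	m.limit = new_limit;
	m.enabled = true;
	ok = model_config_ok(&m);	/* enabled with the stale pointer */
	m.ptr = new_ptr;
	return ok && model_config_ok(&m);
}
```

Starting from a disabled buffer with an old pointer of 0x9000 and moving
to a new pointer of 0x1000 with a new limit of 0x2000, model_in_order()
never sees a bad configuration, while model_reordered() passes through a
state where the buffer is enabled and the stale pointer is above the new
limit.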
Thanks
James