[PATCH v2 2/2] perf mem: Support HITM for when mem_lvl_num is used
German Gomez
german.gomez at arm.com
Wed Mar 16 08:10:55 PDT 2022
On 16/03/2022 12:42, Leo Yan wrote:
> On Wed, Mar 16, 2022 at 11:43:52AM +0000, German Gomez wrote:
>
> [...]
>
>>>>> I had a look at the TRMs for the N1[1], V1[2] and N2[3] Neoverse cores
>>>>> (specifically the LL_CACHE_RD pmu events). If we were to assign a number
>>>>> to the system cache (assuming all caches are implemented):
>>>>>
>>>>> *For N1*, if L2 and L3 are implemented, system cache would follow at *L4*
>>>> To date no one has built 4 level though. Everyone has only built three.
>>> The N1SDP board advertises 4 levels (we use it regularly for testing perf patches)
>> That said, it's probably the odd one out.
>>
>> I'm not against assuming 3 levels. Later if there's is a strong need for L4, indeed we can go back and change it.
> Thanks for the info.
>
> For exploring cache hierarchy via sysFS is a good idea, the only one
> concern for me is: can we simply take the system cache as the same
> thing as the highest level cache? If so, I think another option is to
For Neoverse, it should be. LL_CACHE_RD pmu event says (if system cache is implemented):
* If CPUECTLR.EXTLLC is set: This event counts any cacheable read transaction which returns a data source of 'interconnect cache'.
> define a cache level as "PERF_MEM_LVLNUM_SYSTEM_CACHE" and extend the
> decoding code for support it.
>
> With PERF_MEM_LVLNUM_SYSTEM_CACHE, it can tell users clearly the data
> source from system cache, and users can easily map this info with the
> cache media on the working platform.
>
> In practice, I don't object to use cache level 3 at first step. At
> least this can meet the requirement at current stage.
Ok, I agree. I think for now it is a good compromise.
Detecting the caches seems like an additional/separate perf feature.
Thanks,
German
> Thanks,
> Leo
More information about the linux-arm-kernel
mailing list