[PATCH v4 2/4] perf arm-spe: Use SPE data source for neoverse cores

Leo Yan leo.yan at linaro.org
Thu Apr 7 18:18:52 PDT 2022


On Thu, Apr 07, 2022 at 04:24:35PM +0100, German Gomez wrote:
> Hi,
> 
> On 31/03/2022 13:44, Leo Yan wrote:
> > [...]
> >>> I'd like to do this in a separate patch, but I have one other proposal. The
> >>> Neoverse cores L2 is strictly inclusive of the L1, so even if it's in the L1,
> >>> it's also in the L2. Given that the Graviton systems and afaik the Ampere
> >>> systems don't have any cache between the L2 and the SLC, thus anything from
> >>> PEER_CORE, LCL_CLSTR, or PEER_CLSTR would hit in the L2, perhaps we
> >>> should just set L2 for these cases? German, are you good with this for now? 
> >> Sorry for the delay. I'd like to also check this with someone. I'll try
> >> to get back asap. In the meantime, if this approach is also OK with Leo,
> >> I think it would be fine by me.
> 
> Sorry for the delay. Yeah setting it to L2 indeed looks reasonable for
> now. Somebody brought up the case of running SPE in a heterogeneous 
> system, but also we think might be beyond the scope of this change.
> 
> One very minor nit though. Would you be ok with renaming LCL to LOCAL 
> and CLSTR to CLUSTER? I sometimes mistead the former as "LLC".

Ali's suggestion is to use the format: highest_cache_level | MEM_SNOOP_PEER.

Simply to say, the highest cache level is where we snoop the cache
data with the highest cache level.  And we use an extra snoop op
MEM_SNOOP_PEER as the flag to indicate a peer snooping from the local
cluster or peer cluster.

Please review the more detailed discussion in another email.

> > Thanks for the checking internally.  Let me just bring up my another
> > thinking (sorry that my suggestion is float): another choice is we set
> > ANY_CACHE as cache level if we are not certain the cache level, and
> > extend snoop field to indicate the snooping logics, like:
> >
> >   PERF_MEM_SNOOP_PEER_CORE
> >   PERF_MEM_SNOOP_LCL_CLSTR
> >   PERF_MEM_SNOOP_PEER_CLSTR
> >
> > Seems to me, we doing this is not only for cache level, it's more
> > important for users to know the variant cost for involving different
> > snooping logics.
> >
> I see there's been some more messages I need to catch up with. Is the 
> intention to extend the PERF_MEM_* flags for this cset, or will it be
> left for a later change?

The plan is to extend the PERF_MEM_* flags in this patch set.

> In any case, I'd be keen to take another look at it and try to bring
> some more eyes into this.

Sure.  Please check at your side and thanks for confirmation.

Thanks,
Leo



More information about the linux-arm-kernel mailing list