[PATCH v4 2/4] perf arm-spe: Use SPE data source for neoverse cores

Leo Yan leo.yan at linaro.org
Thu Mar 31 05:44:25 PDT 2022


On Thu, Mar 31, 2022 at 01:28:58PM +0100, German Gomez wrote:
> Hi all,
> 
> It seems I gave the Review tags a bit too early this time. Apologies for
> the inconvenience. Indeed there were more interesting discussions to be
> had :)
> 
> (Probably best to remove my tags for the next re-spin)

No worries, German.  Your review and testing are very helpful :)

> On 29/03/2022 15:32, Ali Saidi wrote:
> > [...]
> >
> >> I still think we should consider extending the memory levels to
> >> demonstrate clear memory hierarchy on Arm archs, I personally like the
> >> definitions for "PEER_CORE", "LCL_CLSTR", "PEER_CLSTR" and "SYS_CACHE",
> >> though these cache levels are not precise like L1/L2/L3 levels, they can
> >> help us to map very well for the cache topology on Arm archs and without
> >> any confusion.  We could take this as an enhancement if you don't want
> >> to bother the current patch set's upstreaming.
> > I'd like to do this in a separate patch, but I have one other proposal. The
> > Neoverse cores L2 is strictly inclusive of the L1, so even if it's in the L1,
> > it's also in the L2. Given that the Graviton systems and afaik the Ampere
> > systems don't have any cache between the L2 and the SLC, thus anything from
> > PEER_CORE, LCL_CLSTR, or PEER_CLSTR would hit in the L2, perhaps we
> > should just set L2 for these cases? German, are you good with this for now? 
> 
> Sorry for the delay. I'd like to also check this with someone. I'll try
> to get back asap. In the meantime, if this approach is also OK with Leo,
> I think it would be fine by me.

Thanks for checking internally.  Let me bring up another thought (sorry
that my suggestions keep shifting): another option is to set ANY_CACHE as
the cache level when we are not certain of the level, and to extend the
snoop field to indicate the snooping logic, like:

  PERF_MEM_SNOOP_PEER_CORE
  PERF_MEM_SNOOP_LCL_CLSTR
  PERF_MEM_SNOOP_PEER_CLSTR

It seems to me this is not only about the cache level; it is more
important for users to know the varying cost of the different snooping
paths involved.
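
To make the idea above concrete, here is a rough, standalone sketch.  All
names and values in it (spe_source, mem_level, snoop_kind, classify) are
made up for illustration; they are not the perf_event.h definitions and
not the code in this patch set.  It only shows the shape of the mapping:
report ANY_CACHE when the exact level is unknown, and carry the
peer/cluster detail in a separate snoop field.

```c
/* Hypothetical data sources named after this thread; values are arbitrary. */
enum spe_source {
	SRC_L1D,
	SRC_L2,
	SRC_PEER_CORE,
	SRC_LCL_CLSTR,
	SRC_PEER_CLSTR,
};

/* Hypothetical cache levels; LEVEL_ANY_CACHE means "some cache, level unknown". */
enum mem_level {
	LEVEL_L1,
	LEVEL_L2,
	LEVEL_ANY_CACHE,
};

/* Hypothetical snoop kinds mirroring the three proposed flags. */
enum snoop_kind {
	SNOOP_NONE,
	SNOOP_PEER_CORE,
	SNOOP_LCL_CLSTR,
	SNOOP_PEER_CLSTR,
};

struct mem_info {
	enum mem_level level;
	enum snoop_kind snoop;
};

/*
 * Map a data source to (level, snoop).  Sources that name a snoop path
 * rather than a precise level keep the ANY_CACHE default and record the
 * path in the snoop field instead.
 */
static struct mem_info classify(enum spe_source src)
{
	struct mem_info info = { LEVEL_ANY_CACHE, SNOOP_NONE };

	switch (src) {
	case SRC_L1D:
		info.level = LEVEL_L1;
		break;
	case SRC_L2:
		info.level = LEVEL_L2;
		break;
	case SRC_PEER_CORE:
		info.snoop = SNOOP_PEER_CORE;
		break;
	case SRC_LCL_CLSTR:
		info.snoop = SNOOP_LCL_CLSTR;
		break;
	case SRC_PEER_CLSTR:
		info.snoop = SNOOP_PEER_CLSTR;
		break;
	}

	return info;
}
```

With this split, classify(SRC_PEER_CORE) yields ANY_CACHE plus the
peer-core snoop kind, so a tool could group samples by snoop path without
claiming a cache level we cannot be sure of.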

Thanks,
Leo


