[PATCH v4 4/4] perf mem: Support HITM for when mem_lvl_num is any
Leo Yan
leo.yan at linaro.org
Fri Mar 25 23:23:03 PDT 2022
On Thu, Mar 24, 2022 at 06:33:23PM +0000, Ali Saidi wrote:
> For loads that hit in a the LLC snoop filter and are fulfilled from a
> higher level cache on arm64 Neoverse cores, it's not usually clear what
> the true level of the cache the data came from (i.e. a transfer from a
> core could come from it's L1 or L2). Instead of making an assumption of
> where the line came from, add support for incrementing HITM if the
> source is CACHE_ANY.
>
> Since other architectures don't seem to populate the mem_lvl_num field
> here there shouldn't be a change in functionality.
>
> Signed-off-by: Ali Saidi <alisaidi at amazon.com>
> Tested-by: German Gomez <german.gomez at arm.com>
> Reviewed-by: German Gomez <german.gomez at arm.com>
> ---
> tools/perf/util/mem-events.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index e5e405185498..084977cfebef 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -539,6 +539,15 @@ do { \
> stats->ld_llchit++;
> }
>
> + /*
> + * A hit in another cores cache must mean a llc snoop
> + * filter hit
> + */
> + if (lnum == P(LVLNUM, ANY_CACHE)) {
> + if (snoop & P(SNOOP, HITM))
> + HITM_INC(lcl_hitm);
> + }
This might break the memory profiling result for x86, see file
arch/x86/events/intel/ds.c:
97 void __init intel_pmu_pebs_data_source_skl(bool pmem)
98 {
99 u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4);
...
105 pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM);
106 }
Which means that it's possible that it's a remote access and the cache
level is ANY_CACHE, it's good to add checking for bit
PERF_MEM_REMOTE_REMOTE:
u64 remote = data_src->mem_remote;
/*
* A hit in another cores cache must mean a llc snoop
* filter hit
*/
if (lnum == P(LVLNUM, ANY_CACHE) && remote != P(REMOTE, REMOTE)) {
if (snoop & P(SNOOP, HITM))
HITM_INC(lcl_hitm);
}
Appreciate German's reviewing and testing, and sorry I jumped in very
late.
Thanks,
Leo
> +
> if (lvl & P(LVL, LOC_RAM) || lnum == P(LVLNUM, RAM)) {
> stats->lcl_dram++;
> if (snoop & P(SNOOP, HIT))
> --
> 2.32.0
>
More information about the linux-arm-kernel
mailing list