[PATCH v5 2/5] perf: Add SNOOP_PEER flag to perf mem data struct

Liang, Kan kan.liang at linux.intel.com
Wed Apr 27 12:29:31 PDT 2022



On 4/27/2022 12:19 PM, Leo Yan wrote:
> Hi Kan,
> 
> On Mon, Apr 25, 2022 at 01:01:40PM -0400, Liang, Kan wrote:
>>
>>
>> On 4/24/2022 7:43 AM, Leo Yan wrote:
>>> On Sat, Apr 23, 2022 at 05:53:28AM -0700, Andi Kleen wrote:
>>>>
>>>>> Except SNOOPX_FWD means a no modified cache snooping, it also means it's
>>>>> a cache conherency from *remote* socket.  This is quite different from we
>>>>> define SNOOPX_PEER, which only snoop from peer CPU or clusters.
>>>>>
>>
>> The FWD doesn't have to be *remote*. The definition you quoted is just for
>> the "L3 Miss", which is indeed a remote forward. But we still have
>> cross-core FWD. See Table 19-101.
>>
>> Actually, X86 uses the PERF_MEM_REMOTE_REMOTE + PERF_MEM_SNOOPX_FWD to
>> indicate the remote FWD, not just SNOOPX_FWD.
> 
> Thanks a lot for the info.
> 
>>>>> If no objection, I prefer we could keep the new snoop type SNOOPX_PEER,
>>>>> this would be easier for us to distinguish the semantics and support the
>>>>> statistics for SNOOPX_FWD and SNOOPX_PEER separately.
>>>>>
>>>>> I overlooked the flag SNOOPX_FWD, thanks a lot for Kan's reminding.
>>>>
>>>> Yes seems better to keep using a separate flag if they don't exactly match.
>>>>
>>
>> Yes, I agree with Andi. If you still think the existing flag combination
>> doesn't match your requirement, a new separate flag should be introduced.
>> I'm not familiar with ARM. I think I will leave it to you and the maintainer
>> to decide.
> 
> It's a bit difficult for me to make decision is because now SNOOPX_FWD
> is not used in the file util/mem-events.c, so I am not very sure if
> SNOOPX_FWD has the consistent usage across different arches.

No, it's used in the file util/mem-events.c
See perf_mem__snp_scnprintf().

> 
> On the other hand, I sent a patch for 'peer' flag statistics [1], you
> could review it and it only stats for L2 and L3 cache level for local
> node.

If it's for the local node, why don't you use the hop level which is 
introduced recently by Power? The below seems a good fit.

PERF_MEM_LVLNUM_ANY_CACHE | PERF_MEM_HOPS_0?

/* hop level */
#define PERF_MEM_HOPS_0		0x01 /* remote core, same node */
#define PERF_MEM_HOPS_1		0x02 /* remote node, same socket */
#define PERF_MEM_HOPS_2		0x03 /* remote socket, same board */
#define PERF_MEM_HOPS_3		0x04 /* remote board */
/* 5-7 available */
#define PERF_MEM_HOPS_SHIFT	43

Thanks,
Kan

> 
> The main purpose for my sending this email is if you think the FWD can
> be the consistent for both arches, and even the new added display mode
> is also useful for x86 arch (we can rename it as 'fwd' display mode),
> then I am very glad to unify the flag.
> 
> Thanks,
> Leo
> 
> [1] https://lore.kernel.org/lkml/20220427155013.1833222-5-leo.yan@linaro.org/






More information about the linux-arm-kernel mailing list