[PATCH V2 3/5] perf mem: Clean up perf_mem_events__name()

Leo Yan leo.yan at linaro.org
Wed Dec 13 05:33:36 PST 2023


On Mon, Dec 11, 2023 at 01:39:36PM -0500, Liang, Kan wrote:
> On 2023-12-09 12:48 a.m., Leo Yan wrote:
> > On Thu, Dec 07, 2023 at 11:23:36AM -0800, kan.liang at linux.intel.com wrote:
> >> From: Kan Liang <kan.liang at linux.intel.com>
> >>
> >> Introduce a generic perf_mem_events__name(). Remove the ARCH-specific
> >> one.
> > 
> > Now memory event naming is arch dependent, this is because every arch
> > has different memory PMU names, and it appends specific configurations
> > (e.g. aarch64 appends 'ts_enable=1,...,min_latency=%u', and x86_64
> > appends 'ldlat=%u', etc).  This is not friendly for extension, e.g. it's
> > impossible for users to specify any extra attributes for memory events.
> > 
> > This patch tries to consolidate in a central place for generating memory
> > event names, its approach is to add more flags to meet special usage
> > cases, which means the code gets more complex (and more difficult for
> > later's maintenance).
> > 
> > I think we need to distinguish the event naming into two levels: in
> > util/mem-events.c, we can consider it as a common layer, and we maintain
> > common options in it, e.g. 'latency', 'all-user', 'all-kernel',
> > 'timestamp', 'physical_address', etc.  In every arch's mem-events.c
> > file, it converts the common options to arch specific configurations.
> > 
> > In the end, this can avoid to add more and more flags into the
> > structure perf_mem_event.
> 
> The purpose of this cleanup patch set is to remove the unnecessary
> __weak functions, and try to make the code more generic.

I understand weak functions are not very good practice.  The point is
weak functions are used for some archs have implemented a feature but
other archs have not.

I can think a potential case to use a central place to maintain the
code for all archs - when we want to support cross analysis.  But this
patch series is for supporting cross analysis, to be honest, I still
don't see benefit for this series, though I know you might try to
support hybrid modes.

> The two flags has already covered all the current usage.
> For now, it's good enough.
> 
> I agree that if there are more flags, it should not be a perfect
> solution. But we don't have the third flag right now. Could we clean up
> it later e.g., when introducing the third flag?
> 
> I just don't want to make the patch bigger and bigger. Also, I don't
> think we usually implement something just for future possibilities.

Fine for me, but please address two main concerns in next spin:

- Fix building failure in patch 01;
- Fix the concerned regression on Arm64/AMD archs in patch 04.  I will
  give a bit more explanation in another reply.


[...]

> >> -#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
> >> +#define E(t, n, s, l, a) { .tag = t, .name = n, .sysfs_name = s, .ldlat = l, .aux_event = a }
> >>  
> >>  struct perf_mem_event perf_mem_events_arm[PERF_MEM_EVENTS__MAX] = {
> >> -	E("spe-load",	"arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/",	"arm_spe_0"),
> >> -	E("spe-store",	"arm_spe_0/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/",			"arm_spe_0"),
> >> -	E("spe-ldst",	"arm_spe_0/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/",	"arm_spe_0"),
> >> +	E("spe-load",	"%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=0,min_latency=%u/",	"arm_spe_0",	true,	0),
> >> +	E("spe-store",	"%s/ts_enable=1,pa_enable=1,load_filter=0,store_filter=1/",			"arm_spe_0",	false,	0),
> >> +	E("spe-ldst",	"%s/ts_enable=1,pa_enable=1,load_filter=1,store_filter=1,min_latency=%u/",	"arm_spe_0",	true,	0),
> > 
> > Arm SPE is AUX event, should set '1' to the field '.aux_event'.
> 
> It actually means whether an extra event is required with a mem_loads
> event.  I guess for ARM the answer is no.

Thanks for confirmation.

> > I am a bit suspect if we really need the field '.aux_event', the
> > '.aux_event' field is only used for generating event string.
> 
> No, it stores the event encoding for the extra event.
> ARM doesn't need it, so it's 0.

I searched a bit and confirmed '.aux_event' is only used in
util/mem-events.c and for 'perf record'.

I failed to connect the code with "it stores the event encoding for the
extra event".  Could you elaborate a bit for this?

Thanks,
Leo



More information about the linux-arm-kernel mailing list