[PATCH v2 2/5] perf jevents: Match on highest version of Arm json file available

James Clark james.clark at arm.com
Wed Jul 12 07:06:03 PDT 2023



On 12/07/2023 12:33, John Garry wrote:
> On 12/07/2023 12:00, James Clark wrote:
>>
>>
>> On 12/07/2023 10:22, John Garry wrote:
>>> On 11/07/2023 11:18, James Clark wrote:
>>>>>> The highest valid version of json files should be used,
>>>>> What exactly does that mean?
>>>>>
>>>>> So it seems that you have CPUs with matching MIDR except variant and
>>>>> revision, but have different events, right?
>>>> Yes. In this case we changed how a metric is calculated in N2-r0p3
>>>> because the workaround that was needed for r0p0 is no longer needed
>>>> (CPU_CYCLES should not subtracted from stalls for topdown metrics
>>>> anymore).
>>>
>>> If there are only very subtle differences, then could we solve with
>>> metric expression function? We already do something else like this for
>>> literal #slots for arm64
>>
>> Possibly, but I'd have to add a new mechanism to expose the version
>> numbers to the metrics so it can be used in an expression. And I think
>> it could get very messy quite quickly and make the expression hard to
>> read. In fact in this specific case I'm not even sure how it would look.
>> Something like this (where #p is the p in r0p3)?
>>
>>   {
>>   "ArchStdEvent": "bad_speculation",
>>   "MetricExpr": "(100 * (((1 - (OP_RETIRED / OP_SPEC)) * (1 -
>> ((STALL_SLOT - (CPU_CYCLES if (#p >= 3) else 0) / (CPU_CYCLES *
>> #slots)))) + ((BR_MIS_PRED * 4) / CPU_CYCLES)))"
> 
> Could this formula for "CPU_CYCLES if (#p >= 3) else 0" be generalised
> with a literal (like what we do for #slots)?
> 

Yes that could work, then we wouldn't need the logic inside the formula.

>>   },
>>
>> In this case it's not so bad because we don't need to compare r as well,
>> but I think it could get worse if r was non 0 or there were more
>> significant differences to the metric.
>>
>> It also doesn't scale in the same way when we're using auto generated
>> metrics with the data that Arm is publishing. We're going to treat CPU
>> versions as a completely new set of JSONs if there are any
>> differences/fixes in the PMU stuff and just publish a whole new set.
>> That way any tool that's consuming these can just check the MIDR and use
>> whichever JSONS are most appropriate.
> 
> So far we have got away with ignoring revision and variant - why change
> this now just for these arm parts is a good question..
> 
>> In theory this shouldn't require
>> any manual intervention or re-writing of expressions. And it saves every
>> tool having to solve the same problem.
> 
> Sure, but we have tried to reduce duplication as much as possible, and
> example would be ArchStdEvents
> 

What's the end goal of reducing duplication? Is it to reduce the binary
size or to reduce human input?

For the binary size if most of these strings are the same then they are
de-duplicated and the impact on the binary size is very small.

If it's to save on human input, then not doing it this way has the
potential to be more work in the long run. At the moment we can just run
a script and generate the Arm JSONs using what Arm is publishing. We
might understand all these subtle de-duplication efforts, but anyone
coming in the future is going to have a hard time understanding.
Especially if they just want to re-run the script to update some text,
then there is a big risk of making a mistake when re-applying the manual
changes.

I'm not completely opposed to the idea of re-working this expression
this time so that it works in both places, but I think we should be sure
that we do it for the right reasons.

As an example for comparison, I tried a build with a single set of
events and both sets (from this change) and the difference in size is
1760 bytes. Is it really worth starting to edit formulas just to save
that much space?

>>
>> It's possible in the future that there are also small changes to the
>> description text or availability of the events between versions. It
>> hasn't happened yet, but it gives us the flexibility to do that too.
> 
> Here the (handful of) metrics are subtly different between these
> rev/variant, but all the events are the same. It's hard to justify
> duplicating almost everything just for that.
> 

If it saves time and reduces the chance of making mistakes then
personally I don't see an issue with the duplication.

>>
> 
> I am not sure on how to handle this, since we might be bombarded with
> support for more CPUs with the same MIDR issue and having quirks for all
> becomes unworkable.
> 

I really don't see this being used very often. And it will always be
used much less often than the task of adding entirely new CPUs, so it
won't cause the number of JSONs to grow faster than that already
existing work. To be honest there is a non-zero chance that it never
gets used again.

Although even if we never use it again we still might want to
re-generate the JSONS because of a typo fix or something so that task is
just a script run rather than script run, and then re-applying the
expression fixes.

> Can you just give the idea above on literals/functions in the metricexpr
> a go, to see how it looks? Maybe Ian has a good idea on similar solutions.
> 
> Thanks,
> John

Yes I can give it a go. Also interested to see what Ian thinks as well.

Thanks
James



More information about the linux-arm-kernel mailing list