[PATCH 0/4] coresight: Add ETR-PERF polling.

Denis Nikitin denik at google.com
Tue May 4 23:46:20 PDT 2021


On Tue, Apr 27, 2021 at 9:04 AM Leo Yan <leo.yan at linaro.org> wrote:
>
> On Tue, Apr 27, 2021 at 09:47:46AM -0600, Mathieu Poirier wrote:
>
> [...]
>
> > > 2) ETR polling ensures that more trace is collected across the entire
> > > trace session - seeking to reduce inconsistent capture volumes.
> >
> > I am not convinced disabling a sink to collect traces while an
> > event is active is the right way to go.  To me it will add (more) complexity to
> > the coresight subsystem for very little gains, if any.
> >
> > If I remember correctly Leo brought forward the exact same idea about a year ago
> > and after discussion, we all agreed the benefit would not be important enough to
> > offset the drawbacks.
> >
> > As usual I am open to discussion and my opinion is not set in stone.  But as I
> > mentioned I worry the feature will increase complexity in the driver and
> > produce dubious results.  And we also have to factor in usability which, as
> > Al pointed out, will be a problem.
>
> Just want to remind one thing for ETR polling.  From one perspective,
> the ETR polling mode is actually very similar to perf's snapshot
> mode.  E.g. we can send the USR2 signal to the perf tool at a specific
> interval to capture CoreSight trace data, so it can also record the
> trace data continuously.
>
> One benefit I can see from ETR polling mode is that it might introduce
> less overhead than perf snapshot mode.  The kernel's mechanism (workqueue
> or kernel thread) will be much more efficient than perf's signal handling
> + SMP call with IPIs.
>
> So it's good to firstly understand if perf snapshot mode can meet the
> requirement or not.

We evaluated the patch on Chrome OS and I can confirm that the quality
of AutoFDO profiles improved greatly with ETR polling.
We tested both per-thread and system-wide modes.

Without ETR polling, the size of the collected ETM data was very
inconsistent on the same workload and could vary by a factor of two.
This, in turn, affects the quality of the AutoFDO profiles generated from ETM.
With ETR polling, the data size became stable from run to run.
Performance evaluation shows similar consistency in the performance gain
from AutoFDO optimization.
This, I think, supports the idea that data collection is currently sensitive
to process scheduling and can be improved with ETR polling.

For system-wide mode in particular, we did not see any alternative
for collecting data periodically on a long-running workload.
We haven't tested snapshot mode, though. The idea sounds interesting.
But low runtime overhead is crucial for a sampling profiler in the field,
and if there is a noticeable difference we would lean towards
ETR polling.
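
For comparison, the snapshot-mode approach Leo describes could be driven
from user space roughly as below. This is only a sketch under assumptions,
not something we have run: the helper names, the 1 second interval, the
snapshot count and the exact perf command line (in particular the
"cs_etm//u" event spec and the output file name) are illustrative.
It relies on perf record's snapshot mode (-S), which takes a snapshot
when perf receives SIGUSR2.

```python
import os
import signal
import subprocess
import time


def snapshot_periodically(pid, interval_s, count, send=os.kill, sleep=time.sleep):
    """Send SIGUSR2 to `pid` up to `count` times, `interval_s` seconds apart.

    `send` and `sleep` are injectable so the loop can be exercised without perf.
    Returns the number of signals actually delivered.
    """
    sent = 0
    for _ in range(count):
        sleep(interval_s)
        try:
            send(pid, signal.SIGUSR2)  # asks perf to take one snapshot
        except ProcessLookupError:
            break  # perf has already exited
        sent += 1
    return sent


def record_with_snapshots(interval_s=1.0, count=30):
    """Hypothetical driver: system-wide ETM capture in snapshot mode."""
    # Assumed command line; the event/sink spec depends on the platform.
    cmd = ["perf", "record", "-e", "cs_etm//u", "-S", "-a", "-o", "perf.data"]
    perf = subprocess.Popen(cmd)
    try:
        return snapshot_periodically(perf.pid, interval_s, count)
    finally:
        perf.send_signal(signal.SIGINT)  # stop recording and flush perf.data
        perf.wait()
```

Calling record_with_snapshots() would take one snapshot per interval, so
the overhead to compare against ETR polling is the signal handling and
IPI cost per snapshot that Leo mentions.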

Thanks,
Denis

>
> Thanks,
> Leo



More information about the linux-arm-kernel mailing list