[PATCH 0/4] coresight: Add ETR-PERF polling.

Mike Leach mike.leach at linaro.org
Tue Apr 27 15:41:01 BST 2021


Hi Mathieu,

I thought I'd add a little background to what has been said so far...

On Tue, 27 Apr 2021 at 11:43, Al Grant <Al.Grant at arm.com> wrote:
>
> > Hi Daniel,
> >
> > On Wed, Apr 21, 2021 at 02:04:09PM +0200, Daniel Kiss wrote:
> > > This series adds a feature to ETR-PERF that syncs the ETR buffer to
> > > perf periodically. This is really handy when system-wide trace is
> > > used, because in this case perf won't sync during the trace. In a
> > > per-thread setup the traced program might not enter the kernel
> > > frequently enough to collect trace. Polling helps in both use cases.
> > > It can be used with strobing.
> > > Tuning the polling period is challenging; I'm working on an additional
> > > patch that adds some metrics to help tune the polling period.
> > >
> >
> > Suzuki and Leo have already commented on a number of problems with this set
> > and as such I will concentrate on the general idea.
> >
> > Over the years we have thought long and hard about fixing the overflow issues
> > created by the lack of interrupt when a sink gets full; installing a timer to empty
> > the sink buffer at regular intervals is one of the ideas considered.  Ultimately we
> > haven't moved forward with the idea because it requires stopping the sink while an
> > event is active, something that introduces more trace data loss.
> >
> > To me this kind of interval snapshot should be achieved using Mike's new
> > strobing feature that came bundled with the complex configuration framework,
> > available on next-ETE-TRBE[1].  I will rebase that branch to 5.13-rc1 when it is
> > released in a couple of weeks from now.
>
> It's important to understand what strobing is. It acts internally to the ETM
> and switches the ETM on for a time and then off for a time. It is as the
> name suggests, like a stroboscope (or a lighthouse).
>
> There is no synchronization between the on-periods of different ETMs.
> When you have multiple ETMs funnelling into a common ETR, strobing
> does not guarantee you a window where you can safely harvest the buffer.
> It achieves a reduction in the overall bandwidth of trace being dumped
> into the buffer, and there may be times when no trace is being written
> at all because all the ETMs are in their off-period.
>
> At worst, it may create a false sense of security - tests that consistently
> fail without strobing, may pass often enough with strobing to create the
> impression that strobing has solved the problem. But these tests are also
> likely to fail eventually with strobing. To fix this problem without
> disabling either ETR or ETMs you would have to guarantee that you can
> harvest the ETR buffer in less time than it takes to fill it. That would need
> very careful quantitative arguments to be made about:
>
>  - the rate of trace generation by each ETM (as modified by strobing)
>
>  - the number of ETMs writing into the buffer
>
>  - the time available to the kernel to harvest the buffer
>
> So if there are 10 ETMs generating trace at average 1Gb/s into a 1Mb
> buffer, the buffer will fill in 100us, and that gives the kernel 100us to
> harvest the buffer before its read pointer is caught up by the ETR's
> advancing write pointer. If strobing is used to reduce average ETM rate
> to 100Mb/s the kernel has 1ms to read the buffer, and so on. In short
> the kernel must *guarantee* a minimum readout rate equal to the
> maximum aggregate write rate of the ETMs. But can the kernel
> guarantee any minimum readout rate at all?
>
> The alternative would be double-buffering the ETR, which we've
> also discussed - so while the kernel is harvesting the contents of one
> buffer, the ETR is writing (and possibly wrapping) the other.
> Some trace will still be lost but it does mean the kernel will be
> harvesting monotonically increasing sequences of trace, and won't be
> seeing artefacts from its reads colliding with the ETR's writes.
>
> Al
>

As Al mentions, ETR polling is designed to solve a different issue
than ETM strobing.  These two techniques can be used together or
separately.
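
To put numbers on Al's buffer-fill argument above, here is a minimal
sketch of the arithmetic. It assumes the 1Gb/s figure is per ETM and
that rate and buffer size are in the same unit; the function name and
parameters are hypothetical, purely for illustration.

```python
def time_to_fill_us(num_etms, rate_per_etm, buffer_size):
    """Microseconds until the aggregate ETM output fills the ETR buffer.

    rate_per_etm is in units per second and buffer_size is in the same
    unit (assumed here to be bits, as in the quoted figures).
    """
    aggregate_rate = num_etms * rate_per_etm
    return buffer_size * 1_000_000 / aggregate_rate


# 10 ETMs at an average 1Gb/s each, writing into a 1Mb buffer.
no_strobing_us = time_to_fill_us(10, 1_000_000_000, 1_000_000)

# Strobing reduces the average per-ETM rate to 100Mb/s.
with_strobing_us = time_to_fill_us(10, 100_000_000, 1_000_000)

print(no_strobing_us, with_strobing_us)
```

This reproduces the 100us and 1ms windows quoted above. It says nothing
about whether the kernel can actually harvest the buffer in that time.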

Users noticed that the amount of trace captured during a given trace
run would vary greatly, even when tracing the same application for the
same length of time.
This was also found to be sensitive to process scheduling: frequent
re-scheds did seem to result in more frequent ETR updates and more
trace data collected. If perf does not wake up during a trace run then
the ETR may wrap multiple times, and all the data will be a single
buffer biased towards the end of the trace session.
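
The wrapping effect can be sketched with a toy model of a circular
buffer. This is only an illustration under simplifying assumptions:
trace is produced at a constant rate, a harvest is instantaneous, and
nothing is lost while the sink is stopped. The function name and the
numbers are hypothetical.

```python
def surviving_ranges(buf_size, harvest_points):
    """Return (start, end) trace offsets that survive each harvest.

    harvest_points are increasing offsets into the trace stream at which
    the buffer is copied out. Anything written more than buf_size before
    a harvest has already been overwritten by the wrap.
    """
    ranges = []
    last = 0
    for point in harvest_points:
        written = point - last
        kept = min(written, buf_size)
        ranges.append((point - kept, point))
        last = point
    return ranges


# 100 units of trace into a 10-unit buffer.
single_harvest = surviving_ranges(10, [100])
polled = surviving_ranges(10, [25, 50, 75, 100])

print(single_harvest)
print(polled)
```

With a single harvest at the end only the last buffer's worth of trace
survives. With four harvests, four buffers' worth survives, spread over
the session.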

ETR polling is designed to ensure that more trace data is collected
consistently across the whole of the trace session. There are issues,
of course, with stopping collection without stopping the sources,
shared to some extent by the ETE / TRBE combination.
This can result in incomplete packets and other trace discontinuities.
For this reason it is necessary to ensure that the decoder is
restarted for each block of trace captured. This is where the patch
set from James is needed: it uses AUX records in perf to correctly
split the AUXTRACE records into valid blocks.

In summary:
1) ETM strobing samples trace to allow greater coverage of the program
being traced for a given buffer. This is useful when building
statistical profiles such as for AutoFDO.
2) ETR polling ensures that more trace is collected across the entire
trace session, seeking to reduce inconsistent capture volumes.
3) Use AUX records to split the AUXTRACE buffer into valid capture
blocks and reset the decoder at the start of these blocks. This is
essential for ETE+TRBE, for ETR polling, and for systems where we are
seeing hardware errata around the flush process causing similar
spurious packets. (An alternative for the ETR polling / flush errata
might be to insert barrier packets to force a decoder reset for every
ETR block copied to the perf buffer, but this does not work for
ETE/TRBE, which uses no CoreSight formatted framing.)
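
To illustrate why point 3) matters, here is a toy sketch of resetting
a decoder at each capture block. The packet format is invented for the
example (a 0xFF sync byte, then packets made of a length byte followed
by that many payload bytes); it is not the ETM/ETE protocol, and it is
not how OpenCSD or perf implement this.

```python
SYNC = 0xFF  # hypothetical sync marker for this toy format


class ToyDecoder:
    def __init__(self):
        self.reset()

    def reset(self):
        """Drop all state; decoding resumes only after the next sync."""
        self.synced = False
        self.remaining = 0
        self.current = []

    def feed(self, data):
        """Decode bytes, returning the payloads of completed packets."""
        packets = []
        for byte in data:
            if not self.synced:
                # Discard everything until a sync marker is seen.
                self.synced = byte == SYNC
            elif self.remaining == 0:
                if byte != SYNC:
                    # Length byte starts a new packet.
                    self.remaining = byte
                    self.current = []
            else:
                self.current.append(byte)
                self.remaining -= 1
                if self.remaining == 0:
                    packets.append(bytes(self.current))
        return packets


def decode_blocks(blocks, reset_per_block):
    decoder = ToyDecoder()
    packets = []
    for block in blocks:
        if reset_per_block:
            decoder.reset()
        packets.extend(decoder.feed(block))
    return packets


# block_a ends with a truncated 3-byte packet; block_b starts mid-stream.
block_a = bytes([0xFF, 2, 0x10, 0x11, 3, 0x20])
block_b = bytes([0x01, 0xFF, 1, 0x30])

print(decode_blocks([block_a, block_b], reset_per_block=False))
print(decode_blocks([block_a, block_b], reset_per_block=True))
```

Without the reset, the truncated packet at the end of the first block
swallows the start of the second and yields a spurious packet. With the
reset, only the two complete packets are reported.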

Regards

Mike


>
> >
> > Thanks,
> > Mathieu
> >
> > PS: Always run your work through checkpatch.pl before sending a patchset for
> > review.
> >
> > [1]. https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git/log/?h=next-ETE-TRBE
> >
> > > Daniel Kiss (4):
> > >   coresight: tmc-etr: Advance buffer pointer in sync buffer.
> > >   coresight: tmc-etr: Track perf handler.
> > >   coresight: etm-perf: Export etm_event_cpu_path.
> > >   coresight: Add ETR-PERF polling.
> > >
> > >  .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
> > >  drivers/hwtracing/coresight/Makefile          |   2 +-
> > >  .../hwtracing/coresight/coresight-etm-perf.c  |  10 +-
> > >  .../hwtracing/coresight/coresight-etm-perf.h  |   1 +
> > >  .../coresight/coresight-etr-perf-polling.c    | 316 ++++++++++++++++++
> > >  .../coresight/coresight-etr-perf-polling.h    |  42 +++
> > >  .../hwtracing/coresight/coresight-tmc-core.c  |   2 +
> > >  .../hwtracing/coresight/coresight-tmc-etr.c   |  22 +-
> > >  drivers/hwtracing/coresight/coresight-tmc.h   |   2 +
> > >  9 files changed, 401 insertions(+), 4 deletions(-)
> > >  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> > >  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> > >
> > > --
> > > 2.25.1
> > >



--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK


