[PATCH 00/10] perf tools: Add support for CoreSight trace decoding

Mike Leach mike.leach at linaro.org
Tue Jan 16 04:35:26 PST 2018


On 11 January 2018 at 22:18, Mathieu Poirier <mathieu.poirier at linaro.org> wrote:
> On 11 January 2018 at 14:49, Kim Phillips <kim.phillips at arm.com> wrote:
>> On Thu, 11 Jan 2018 14:11:00 -0700
>> Mathieu Poirier <mathieu.poirier at linaro.org> wrote:
>>> On 11 January 2018 at 10:28, Kim Phillips <kim.phillips at arm.com> wrote:
>>> > On Thu, 11 Jan 2018 08:45:21 -0700
>>> > Mathieu Poirier <mathieu.poirier at linaro.org> wrote:
>>> >
>>> >> On 11 January 2018 at 05:23, Mark Brown <broonie at kernel.org> wrote:
>>> >> > On Wed, Jan 10, 2018 at 06:08:21PM -0600, Kim Phillips wrote:
>>> >> >> Mathieu Poirier <mathieu.poirier at linaro.org> wrote:
>>> >> >
>>> >> >> > Instructions on how to build and install the openCSD library are provided
>>> >> >> > in the HOWTO.md of the project repository.
>>> >> >
>>> >> >> Usually when a perf builder sees something they need "on," they - or,
>>> >> >> at least I - start querying the host's package manager for something
>>> >> >> that provides it (e.g., apt search/install libopencsd), but since no
>>> >> >> distro provides libopencsd, this is bad because it misleads the user.
>>> >> >
>>> >> > It's on the radar to push this at distros fairly soon.
>>> >
>>> > Adding packages to distros takes years, this patchset is being
>>> > submitted for inclusion *now*.  So until then, it would greatly
>>> > facilitate users if the relevant libopencsd source files were
>>> > self-contained within perf from the get go.
>>> I do not agree with you on the front that it takes years.  On the flip
>>> side it would take a significant amount of time and effort to refactor
>>> the openCSD library so that it can be added to the kernel tree.  This
>> The dtc wasn't refactored before it was added to the kernel tree.

The openCSD library consists of c. 140 C++ source and header files,
plus a few headers to form the libraries C-API, formatted in what is
an accepted standard within ARM - indent of 4 spaces and no tabs.
Now, afaik, this is unlikely to be acceptable to the kernel tree.

The decision to use C++ was made at the start of this project - the
objective was to produce a standalone, open source, reference trace
decoder library for CoreSight, to be used by any tooling that wanted
it, both open source and proprietary, for linux and windows OS, on x86
and native ARM, plus baremetal on ARM. The majority of stakeholders
involved at the start of the project expressed a preference for C++ -
in part as a significant amount of the code donated to the project was
already C++. The C-API was agreed as an addition to enable integration
with projects that preferred / required that - perf being one example.

Unlike intel PT, that has a single protocol, tightly bound to the core
it is run on, (similar to SPE in this respect), the library supports
multiple trace protocols, which are configured and controlled
independently of the cores that generate them. The library supports
all extant protocols - ETMv3 & 4, PTM and STM, plus the ability to
de-multiplex individual trace streams from the CS trace frames,
decoding these into a common generic standard, removing the need for
tooling to aware of the underlying protocol. Furthermore, given that
CoreSight hardware can support proprietary trace protocols (e.g. DSPs)
being added to and captured in the standard trace streams, the library
supports facilities to plug in additional external decoders.The
library also provides ancilliary tooling support - e.g. string
representations of protocol packets.

The library currently supports ETMv4.1 -> we have a requirement to
support v4.3 once these cores begin to appear, and it is inevitable
that the trace protocols will develop as the ARM architecture
develops. Adding the decoder library to the kernel tree will result in
code churn that is unnecessary for the kernel tree as the library is
developed, especially if functionality is added that is not used by A
class cores (see comment below re data trace).

>>> patchset is available now with a solution that follows what has
>>> already been done for dozens of other external library.  There is no
>>> point in delaying the inclusion of the functionality when an
>>> end-to-end solution exists.
>> See above: I'm not necessarily suggesting the code get refactored.
>>> >> >  Part of the
>>> >> > discussion was wanting to get things to the point where the tools using
>>> >> > the library were far enough along that we could be reasonably sure that
>>> >
>>> > Curious, what other tools are there?
>>> Ask around at ARM.
>> I'm asking the person that claimed it.

The intent is to use openCSD as a decoder within existing ARM tools -
though the library will need to be extended to decode Data trace
before this is possible. (ARM A class processors are instruction trace
only. )
Additionally the library test program itself is in fact a useful tool
for investigating trace capture / corruption / setup errors in
hardware. It is capable of listing trace packets from snapshots
captured by DS-5 or the open source CoreSight Access Library.

>>> >> > there weren't any problems that were going to require ABI breaks to fix
>>> >> > before pushing the library at distros since ABI churn isn't nice for
>>> >> > packagers to deal with.
>>> >
>>> > Why make perf the guinea pig?  Whatever, this doesn't preclude
>>> > adding the code into the tree; it can be removed years from now when
>>> > libopencsd becomes ubiquitous among distros.
>>> The same can be said about proceeding the other way around - the
>>> openCSD library can be added to the kernel tree later if it is deemed
>>> necessary.  Until then I really don't see why we'd prevent people from
>>> accessing the functionality.
>> Again, I'm not suggesting the code be refactored...
>>> >> > There's also a bit of a chicken and egg problem
>>> >> > in that it's a lot easier to get distros to package libraries that have
>>> >> > users available (some are not really bothered about this of course but
>>> >> > it still helps).
>>> >>
>>> >> Moreover including in the kernel tree every library that can
>>> >> potentially be used by the perf tools simply doesn't scale.
>>> >
>>> > This is a trace decoder library we're talking about:  there are no
>>> > others in perf's system features autodetection list.  And why wouldn't
>>> > adding such libraries scale?
>>> I don't see why a decoder library and say, libelf, need to be treated
>>> differently.
>> libelf is a mature library based on an industry-wide standard, not to
>> mention already packaged by most (all?) distros.

We re-factored the library name and the layout of the C-API necessary
header files in the latest release (v0.8) to allow for installation
into a linux system and the build changes made to perf.
Now, while it is true that we need to be careful that users are aware
they need to use the latest release, in general if the library is not
present on a system; if is not part of the distribution, it is very
easy to find using a simple search. Frankly whenever I appear to be
missing a library in linux the first thing I do is a web search, to
figure out what package I am missing (since it is not always obvious
to me what libraries lie in which packages).

Google 'libopencsd' and you see the github HOWTO.md for OpenCSD very quickly.

I think users interested in using the library will have the know-how
to find an install it if the perf build process marks it as missing.
The HOWTO will be maintained as part of the library to explain the
installation and integration with perf.

Given all the above, I cannot see any engineering benefit from adding
the source to the kernel tree. We have a stable and maintainable
solution with the library added to the perf build as an external



>>> >>  The perf
>>> >> tools project has come up with a very cleaver way to deal with
>>> >> external dependencies and I don't see why the OpenCSD library should
>>> >> be different.
>>> >
>>> > Again, the opencsd library is a decoder library:  this patchseries adds
>>> > it as a package dependency (when it isn't even a package in any
>>> > distro), and it's different in that it's the first decoder library to
>>> > be submitted as an external dependency (i.e., not fully built-in, like
>>> > Intel's, or even the Arm SPE's pending submission).
>>> I don't see why we absolutely need to do exactly the same as Intel.
>>> The library is public and this patchset neatly integrates it with the
>>> perf tools.
>> We don't, but it'd be more efficient, upstream-acceptance-wise, but as
>> you brought up above, we wouldn't be able to since we'd have to rewrite
>> libopencsd to conform to upstream codingstyle, etc., so I'm suggesting
>> we might look at a better enablement strategy like how the dtc works.
>> It'd be nice if the upstream maintainers would comment on what would be
>> acceptable instead of us going back and forth between each other.
> Agreed.
>>> >> >> Keeping the library external will also inevitably introduce more
>>> >> >> source level synchronization problems because the perf sources being
>>> >> >> built may not be compatible with their version of the library, whether
>>> >> >> due to new features like new trace hardware support, or API changes.
>>> >> >
>>> >> > Perf users installing from source rather than from a package (who do
>>> >> > tend to the more technical side even for kernel developers) already have
>>> >> > to cope with potentially installing at least dwarf, gtk2, libaudit,
>>> >> > libbfd, libelf, libnuma, libperl, libpython, libslang, libcrypto,
>>> >> > libunwind, libdw-dwarf-unwind, zlib, lzma, bpf and OpenJDK depending on
>>> >> > which features they want.  I'm not sure that adding one more library is
>>> >> > going to be the end of the world here, especially once the packaging
>>> >> > starts to filter through distros.  Until that happens at least people
>>> >> > are no worse off for not having the feature.
>>> >>
>>> >> I completely agree.  Just like any other package, people that want the
>>> >> very latest code need to install from source.
>>> >
>>> > A fully-integrated solution would work better for people, e.g., how are
>>> > people supposed to know what 'latest' is when there are separate,
>>> > unsynchronized git repos?
>>> The same applies to any of the other libraries perf is working with.
>> The packaged libraries?  They are stable:  they don't come in the form
>> of cloning a git repo and building from scratch.
>> The decoder libraries?  They are self-contained within perf.
>>> >> >> As Mark Brown (cc'd) mentioned on the Coresight mailing list, this may
>>> >> >> be able to be done the same way the dtc is incorporated into the
>>> >> >> kernel, where only its relevant sources are included and updated as
>>> >> >> needed:  see linux/scripts/dtc/update-dtc-source.sh.
>>> >> >
>>> >> > Bear in mind that we need dtc for essentially all kernel development on
>>> >> > ARM and when it was introduced it was a new requirement for existing
>>> >> > systems, it's a bit of a different case here where it's an optional
>>> >> > feature in an optional tool.
>>> >
>>> > That argument applies to Intel-PT, yet its decoder is self-contained
>>> > within perf: all non-x86 perf binaries are capable of decoding PT.
>>> > We'd want that for Arm Coresight where perf gets statically built to
>>> > run on much more constrained systems like Android.
>>> Traces can't be decoded properly without the support of external
>>> libraries, whether we are talking about PT or CS.
>> Not true; perf has PT decoding self-contained.
>> Thanks,
>> Kim
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Mike Leach
Principal Engineer, ARM Ltd.
Blackburn Design Centre. UK

More information about the linux-arm-kernel mailing list