[PATCH 0/2] Arm CMN-600 PMU driver
robin.murphy at arm.com
Fri Sep 18 09:06:17 EDT 2020
On 2020-09-09 17:54, Tuan Phan wrote:
> On Sep 9, 2020, at 2:34 AM, Zidenberg, Tsahi <tsahee at amazon.com> wrote:
>> On 05/08/2020, 15:57, "Robin Murphy" <robin.murphy at arm.com> wrote:
>>> At long last, here's an initial cut of the CMN PMU driver that's been
>>> festering in on-and-off development for years. It should be functionally
>>> complete now, although there is still scope for improving the current
>>> implementation (e.g. watchpoint register allocation could be cleverer).
>> Booted on Graviton2 (using ACPI). Cache-fill counter value (both general and
>> bynodeid) responds as expected to memory pressure from user processes.
>> Tested-by: Tsahi Zidenberg <tsahee at amazon.com>
> Tested on Ampere Altra SoC platform. So far I have not seen any issues.
> Tested-by: Tuan Phan <tuanphan at os.amperecomputing.com>
>>> Of particular interest at this point is the user interface - is it
>>> sufficiently complete and useful? Is there any need for a third event
>>> targeting method in between "single node ID" and "all nodes"?
>> The one thing I'm missing (or didn't find) is a way for the user to determine
>> the list of relevant node ids for each node type or counter.
>> Using a wrong nodeid just gave me <not supported> as a counter value.
>> I don't think that it's required for a first version, though.
> Yes, that has bothered me for a while. I either dump the node IDs during the
> discovery phase or keep a node list in an Excel file to get the correct node ID.
For end users who want to measure straightforward things like SLC usage,
the default of counting across all relevant nodes is usually what
they're going to want anyway, and conveniently means they don't need to
know the details. For targeted measurements, even if you know all the
node IDs, how will you know which CPU/memory controller/peripheral/etc.
they correspond to? Just like for CCN and CCI, to do pretty much
anything *meaningful* you'll need independent information about the
wider SoC configuration to know what's connected where, and that
information should inherently give you the node/port IDs anyway.
That said, I did end up writing some more comprehensive debugfs output
for my own development use, and also based on discussions around
tangential development cases like CHI cache-stashing research. I've
included that in v2, so if you think it deserves to be upstream please
do say so.
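
For illustration only, the two targeting modes discussed in this thread
("all nodes" versus "single node ID") could be written as perf event
strings roughly as below. This is a minimal sketch under stated
assumptions, not taken from the patches: the helper cmn_event_spec is
hypothetical, and the PMU name "arm_cmn", the event name "hnf_cache_fill",
the "nodeid" field spelling and the node ID 0x24 are placeholders. Only
"bynodeid" and the idea of a per-node ID appear in the discussion above.

```python
from typing import Optional


def cmn_event_spec(event: str,
                   nodeid: Optional[int] = None,
                   pmu: str = "arm_cmn") -> str:
    """Build a perf-style event string (hypothetical helper).

    The PMU name and the field names are assumptions; check what the
    driver actually exposes under /sys/bus/event_source/devices/.
    """
    if nodeid is None:
        # Default mode: count across all relevant nodes.
        return f"{pmu}/{event}/"
    if nodeid < 0:
        raise ValueError("nodeid must be non-negative")
    # Targeted mode: restrict counting to a single node ID.
    return f"{pmu}/{event},bynodeid=1,nodeid={nodeid:#x}/"


if __name__ == "__main__":
    # Placeholder event name and node ID, for illustration only.
    print(cmn_event_spec("hnf_cache_fill"))
    print(cmn_event_spec("hnf_cache_fill", nodeid=0x24))
```

The resulting strings would then be passed to something like
"perf stat -a -e <string>". Which node ID is meaningful still has to
come from knowledge of the SoC configuration, as noted above.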