oprofile and ARM A9 hardware counter

stephane eranian eranian at googlemail.com
Fri Jan 27 10:57:25 EST 2012


On Fri, Jan 27, 2012 at 4:54 PM, Will Deacon <will.deacon at arm.com> wrote:
> On Fri, Jan 27, 2012 at 03:45:53PM +0000, stephane eranian wrote:
>> Hi,
>
> Hi Stephane,
>
>> Ok, with the one-line patch [1], this works much better now.
>> No more wraparound at 4 billion cycles.
>
> Hurrah! Thanks Mans and Ming Lei for helping with this. Unfortunately, I
> remember Santosh had objections to this patch so that needs to be resolved.
>
Yes, this needs to be resolved ASAP.
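
For reference, the "4 billion cycles" above matches a 32-bit counter, which
wraps after 2^32 = 4294967296 counts. The sketch below is only an
illustration of how a delta can be computed across such a wrap. It is not
the code from patch [1], and the names counter_delta and seconds_until_wrap
are made up for this example. The 1 GHz clock in the usage lines is an
assumption, not a figure from this thread.

```python
COUNTER_BITS = 32                       # width of the hardware counter
COUNTER_MASK = (1 << COUNTER_BITS) - 1  # 0xFFFFFFFF


def counter_delta(prev, now):
    """Counts elapsed between two raw reads, tolerating one wraparound."""
    # Masking the difference to the counter width gives the right answer
    # even when 'now' has wrapped past zero and is smaller than 'prev'.
    return (now - prev) & COUNTER_MASK


def seconds_until_wrap(cpu_hz):
    """How long a free-running cycle counter takes to wrap at cpu_hz."""
    return (1 << COUNTER_BITS) / cpu_hz


if __name__ == "__main__":
    # Hypothetical raw reads taken just before and just after a wrap.
    print(counter_delta(0xFFFFFFF0, 0x00000010))
    # ASSUMPTION: 1 GHz clock, chosen only to make the numbers concrete.
    print(f"{seconds_until_wrap(1_000_000_000):.2f} s between wraps")
```

This only handles a single wrap between two reads, so the counter has to be
read (or its overflow interrupt serviced) more often than once per wrap
interval.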

>> Sampling is okay, though I noticed it tends to not get the
>> correct number of samples for a controlled run:
>>
>> $ perf record -e cycles -c 1009213 noploop 10
>> noploop for 10 seconds
>>
>> $ perf report -D | tail -20
>> cycles stats:
>>            TOTAL events:       9938
>>             MMAP events:         13
>>             COMM events:          2
>>             EXIT events:          2
>>         THROTTLE events:         12
>>       UNTHROTTLE events:         12
>>           SAMPLE events:       9897
>>
>> There should be no throttling here. I should get about 10k samples
>> but am only seeing 9897. The max_rate limit is way higher
>> than the rate my period works out to (about 1000 samples/sec). But then,
>> throttling is broken in 3.2.0. I posted a patch to fix that
>> yesterday. I will try with my patch applied as well.
>
> Ok. Note that on ARM the PMU generates a standard IRQ (i.e. not an NMI) so
> you may miss samples if they occur during critical kernel sections (and if
> you look at a profile, spin_unlock_irqrestore will be quite high).
>
But I am only running a user space noploop. So it spends 99% of its time in
user space, with no critical sections.
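
For the sample count quoted above, the expected number is roughly the cycles
executed divided by the sampling period. A quick sketch of that arithmetic
follows. The clock frequency is not stated in this thread, so the 1 GHz
value is an assumption picked to be consistent with the "about 10k samples"
and "1000 samples/sec" figures. The helper name expected_samples is made up
for this example.

```python
def expected_samples(cpu_hz, seconds, period):
    """Whole sampling periods that fit in a run of the given length."""
    return (cpu_hz * seconds) // period


CPU_HZ = 1_000_000_000  # ASSUMPTION: ~1 GHz clock, not stated in the thread
RUN_SECONDS = 10        # from "noploop 10"
PERIOD = 1009213        # from "-c 1009213"
OBSERVED = 9897         # SAMPLE events in the perf report -D output

expected = expected_samples(CPU_HZ, RUN_SECONDS, PERIOD)
missing = expected - OBSERVED

print(f"expected about {expected} samples, observed {OBSERVED}")
print(f"missing {missing} ({100.0 * missing / expected:.2f}%)")
print(f"implied rate: {CPU_HZ / PERIOD:.0f} samples/sec")
```

If the real clock differs from the assumed value, the expected count scales
with it, so the size of the gap should be read as indicative only.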

> A7 and A15 have the ability to filter counters based on privilege level, so
> you can get more accurate userspace counts there.

Ok, that's better. Need to update libpfm4 for A15 with priv levels then!

>
> Will



More information about the linux-arm-kernel mailing list