oprofile and ARM A9 hardware counter
stephane eranian
eranian at googlemail.com
Fri Jan 27 10:57:25 EST 2012
On Fri, Jan 27, 2012 at 4:54 PM, Will Deacon <will.deacon at arm.com> wrote:
> On Fri, Jan 27, 2012 at 03:45:53PM +0000, stephane eranian wrote:
>> Hi,
>
> Hi Stephane,
>
>> Ok, with the one-line patch [1], this works much better now.
>> No more wraparound at 4 billion cycles.
>
> Hurrah! Thanks Mans and Ming Lei for helping with this. Unfortunately, I
> remember Santosh had objections to this patch so that needs to be resolved.
>
Yes, this needs to be resolved ASAP.
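
For scale, the "4 billion cycles" figure matches 2^32, the range of a 32-bit counter. The sketch below estimates how quickly such a counter wraps. The 1 GHz clock rate is an assumption, not something stated in this thread (it is roughly what the "1000 samples/sec" figure further down implies), and the function name is hypothetical.

```python
# Rough estimate of how long a 32-bit cycle counter runs before it wraps.
# ASSUMPTION: a 1 GHz cycle rate; the thread does not state the clock speed.
COUNTER_BITS = 32
ASSUMED_HZ = 1_000_000_000


def seconds_until_wrap(bits: int, hz: int) -> float:
    """Return the time in seconds for a `bits`-wide counter to overflow at `hz`."""
    return (2 ** bits) / hz


if __name__ == "__main__":
    print("counter range:", 2 ** COUNTER_BITS, "cycles")
    print("wraps after about %.2f s" % seconds_until_wrap(COUNTER_BITS, ASSUMED_HZ))
```

At the assumed rate the counter would wrap in a little over four seconds, so a 10 second run would cross the boundary more than once.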
>> Sampling is okay, though I noticed it tends to not get the
>> correct number of samples for a controlled run:
>>
>> $ perf record -e cycles -c 1009213 noploop 10
>> noploop for 10 seconds
>>
>> $ perf report -D | tail -20
>> cycles stats:
>> TOTAL events: 9938
>> MMAP events: 13
>> COMM events: 2
>> EXIT events: 2
>> THROTTLE events: 12
>> UNTHROTTLE events: 12
>> SAMPLE events: 9897
>>
>> Should not get throttled samples. Should get about 10k samples
>> but only seeing 9897. The max_rate limit is way higher
>> than the rate my period gives (1000 samples/sec). But then,
>> in 3.2.0 throttling is broken. I posted a patch to fix that
>> yesterday. I will try with my patch applied as well.
>
> Ok. Note that on ARM the PMU generates a standard IRQ (i.e. not an NMI) so
> you may miss samples if they occur during critical kernel sections (and if
> you look at a profile, spin_unlock_irqrestore will be quite high).
>
But I am only running a user space noploop. So it spends 99% in user space, no
critical section.
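
As a rough check of the numbers in the run above, the sketch below computes the expected sample count from the period and the run time. The period, duration and observed count come from the quoted output. The 1 GHz cycle rate is an assumption (it is what "1000 samples/sec" implies), and the function names are hypothetical.

```python
# Expected number of samples for a fixed-period cycles run.
PERIOD = 1_009_213           # from `perf record -e cycles -c 1009213`
RUN_SECONDS = 10             # from `noploop 10`
OBSERVED_SAMPLES = 9897      # from `perf report -D`
ASSUMED_HZ = 1_000_000_000   # ASSUMPTION: 1 GHz; not stated in the thread


def expected_samples(hz: int, seconds: int, period: int) -> int:
    """One sample per `period` cycles, so count the whole periods that fit."""
    return (hz * seconds) // period


def sample_rate(hz: int, period: int) -> float:
    """Samples per second produced by a fixed sampling period."""
    return hz / period


if __name__ == "__main__":
    expected = expected_samples(ASSUMED_HZ, RUN_SECONDS, PERIOD)
    print("rate: %.1f samples/sec" % sample_rate(ASSUMED_HZ, PERIOD))
    print("expected:", expected)
    print("missing:", expected - OBSERVED_SAMPLES)
```

Under that assumption the expected count is 9908 rather than a round 10k, which leaves 11 samples unaccounted for. A different real clock rate would change both numbers.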
> A7 and A15 have the ability to filter counters based on privilege level, so
> you can get more accurate userspace counts there.
Ok, that's better. Need to update libpfm4 for A15 with priv levels then!
>
> Will