oprofile and ARM A9 hardware counter

Will Deacon will.deacon at arm.com
Fri Jan 27 10:54:54 EST 2012


On Fri, Jan 27, 2012 at 03:45:53PM +0000, stephane eranian wrote:
> Hi,

Hi Stephane,

> Ok, with the one-line patch [1], this works much better now.
> No more wrap-around at 4 billion cycles.

Hurrah! Thanks Mans and Ming Lei for helping with this. Unfortunately, I
remember Santosh had objections to this patch so that needs to be resolved.
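
For readers unfamiliar with the wrap-around being discussed: a 32-bit cycle counter overflows at 2^32 (about 4.29 billion) cycles, so any software total has to be built from deltas that tolerate the wrap. The sketch below only illustrates that arithmetic. It is not the patch [1], which is not reproduced in this message, and the 1 GHz clock and the helper names are assumptions made for the example.

```python
# Illustration of 32-bit counter wrap-around; not the kernel's code.
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def counter_delta(prev_raw, new_raw):
    """Cycles elapsed between two raw reads, tolerating a single wrap."""
    return (new_raw - prev_raw) & COUNTER_MASK


def accumulate(raw_reads):
    """Sum the deltas between successive raw reads into a wide total."""
    total = 0
    prev = raw_reads[0]
    for raw in raw_reads[1:]:
        total += counter_delta(prev, raw)
        prev = raw
    return total


# Assumed 1 GHz clock, read once per second for 10 seconds. The raw
# 32-bit value wraps during the run, but the accumulated total does not.
ASSUMED_HZ = 1_000_000_000
reads = [(i * ASSUMED_HZ) & COUNTER_MASK for i in range(11)]

if __name__ == "__main__":
    print("seconds until wrap at assumed clock:", (1 << COUNTER_BITS) / ASSUMED_HZ)
    print("last raw read:", reads[-1])
    print("accumulated cycles:", accumulate(reads))
```

The masking only works if the counter is read at least once per wrap interval, which is why an overflow that goes unhandled shows up as a count that never exceeds roughly 4 billion.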

> Sampling is okay, though I noticed it tends to not get the
> correct number of samples for a controlled run:
> 
> $ perf record -e cycles -c 1009213 noploop 10
> noploop for 10 seconds
> 
> $ perf report -D | tail -20
> cycles stats:
>            TOTAL events:       9938
>             MMAP events:         13
>             COMM events:          2
>             EXIT events:          2
>         THROTTLE events:         12
>       UNTHROTTLE events:         12
>           SAMPLE events:       9897
> 
> Should not get throttled samples. Should get about 10k samples
> but only seeing 9897. The max_rate limit is way higher
> than the rate implied by the period I set (about 1000 samples/sec).
> But then, throttling is broken in 3.2.0. I posted a patch to fix that
> yesterday. I will try with my patch applied as well.

Ok. Note that on ARM the PMU generates a standard IRQ (i.e. not an NMI) so
you may miss samples if they occur during critical kernel sections (and if
you look at a profile, spin_unlock_irqrestore will be quite high).
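
The size of the shortfall can be checked with a little arithmetic on the figures quoted above. This is a rough sketch: the period, run time and sample count come from the perf output in this thread, while the 1 GHz clock is an assumption implied by the "1000 samples/sec" estimate and is not stated for the board in question.

```python
# Figures taken from the quoted perf run.
PERIOD = 1_009_213        # cycles per sample (-c 1009213)
RUN_SECONDS = 10          # noploop 10
OBSERVED_SAMPLES = 9_897  # SAMPLE events reported by perf report -D

# Assumption: the core runs at about 1 GHz. The real clock is not given.
ASSUMED_HZ = 1_000_000_000


def expected_samples(clock_hz, seconds, period):
    """Whole sampling periods that fit into the run at a given clock."""
    return (clock_hz * seconds) // period


def cycles_accounted(samples, period):
    """Cycles covered by the samples that were actually recorded."""
    return samples * period


if __name__ == "__main__":
    expected = expected_samples(ASSUMED_HZ, RUN_SECONDS, PERIOD)
    covered = cycles_accounted(OBSERVED_SAMPLES, PERIOD)
    print("expected samples at assumed clock:", expected)
    print("missing samples:", expected - OBSERVED_SAMPLES)
    print("cycles covered by recorded samples:", covered)
    print("effective cycles/sec:", covered / RUN_SECONDS)
```

Under that assumption the period divides the run into slightly fewer than 10k samples, so part of the gap comes from the period not being exactly 1,000,000 cycles and only the remainder from samples that were actually lost.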

A7 and A15 have the ability to filter counters based on privilege level, so
you can get more accurate userspace counts there.
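
On the perf side, restricting an event to userspace is requested with the `:u` event modifier. The helper below is a hypothetical convenience wrapper, not part of perf, that builds the same command line used earlier in this thread with that modifier added. Whether the request is honoured depends on the PMU supporting privilege filtering, as described above.

```python
def perf_record_argv(event, period, workload, user_only=False):
    """Build a `perf record` command line (hypothetical helper).

    user_only appends perf's `:u` modifier, which asks for the event
    to be counted in userspace only.
    """
    spec = f"{event}:u" if user_only else event
    return ["perf", "record", "-e", spec, "-c", str(period), *workload]


if __name__ == "__main__":
    # Same run as in the quoted text, restricted to userspace.
    argv = perf_record_argv("cycles", 1009213, ["noploop", "10"], user_only=True)
    print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run` on a machine where perf and the workload are installed.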

Will



More information about the linux-arm-kernel mailing list