oprofile and ARM A9 hardware counter

Will Deacon will.deacon at arm.com
Mon Jan 30 14:14:43 EST 2012


On Mon, Jan 30, 2012 at 05:45:19PM +0000, stephane eranian wrote:
> There you go, no attachment, not sure the omap list
> supports this.

Cheers, Stephane.

> There is something quite interesting to observe.
> 
> While I run perf record -e cycles -F 100 noploop 10, I watch
> /proc/interrupts. The number of interrupts is way lower than
> expected. Therefore the number of samples is way too low:
> 
> $ perf record -e cycles -F 100 noploop 10
> $ perf report -D | tail -20
> cycles stats:
>            TOTAL events:        535
>             MMAP events:         11
>             COMM events:          2
>             EXIT events:          2
>           SAMPLE events:        520
>
> The delta in /proc/interrupts on CPU1 is 520 interrupts.

Yes, that is about half of what you'd expect. Running on my A9 platform
(vexpress) I get:

$ perf record -e cycles -F 100 noploop 10
$ perf report -D | tail -20
cycles stats:
           TOTAL events:       1007
            MMAP events:         18
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        985
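
For reference, the "about half" estimate follows from comparing the SAMPLE counts against the nominal -F 100 over a 10 second run. A minimal sketch of that arithmetic (the function name is only illustrative, and the 10 second duration is taken from the noploop argument rather than measured):

```python
def sampling_ratio(observed_samples, freq_hz, duration_s):
    """Fraction of the nominally expected samples that were recorded."""
    expected = freq_hz * duration_s  # e.g. 100 samples/sec * 10 sec = 1000
    return observed_samples / expected

# SAMPLE counts quoted in this thread, both with -F 100 and noploop 10
omap4_ratio = sampling_ratio(520, 100, 10)      # OMAP4 run
vexpress_ratio = sampling_ratio(985, 100, 10)   # vexpress run

print(f"OMAP4:    {omap4_ratio:.3f}")
print(f"vexpress: {vexpress_ratio:.3f}")
```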

> So looks like the frequency adjustment which is hooked off of the
> timer tick is either not called at each timer tick, the timer ticks are
> not at regular interval, or the math is wrong.

My hunch is that the interval is probably varying, but I don't know much
about OMAP4 and its clocks.
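
To illustrate why the tick interval matters for -F mode, here is a deliberately simplified model. It is an assumption made for illustration, not the kernel's actual code and not a diagnosis of the OMAP4 behaviour: suppose the sampling period is re-estimated at every tick from the events counted since the previous tick, using a nominal tick length. If the real interval between adjustments is longer than the nominal one, the estimated period comes out too large and the sample rate drops by the same factor. All names and numbers below are hypothetical.

```python
def steady_state_rate(clock_hz, target_hz, assumed_tick_s, actual_tick_s):
    """Samples per second once the period has settled, in a toy model.

    Toy model (assumption): at each adjustment the period is set to
    events_since_last_adjustment / (assumed_tick_s * target_hz).
    """
    events_per_tick = clock_hz * actual_tick_s
    period = events_per_tick / (assumed_tick_s * target_hz)
    return clock_hz / period

# Hypothetical 1 GHz cycle counter, target of 100 samples/sec,
# nominal 10 ms tick.
regular = steady_state_rate(1e9, 100, 0.010, 0.010)  # ticks arrive on time
slow = steady_state_rate(1e9, 100, 0.010, 0.020)     # ticks arrive half as often

print(regular, slow)
```

In this model the achieved rate is simply target_hz * assumed_tick_s / actual_tick_s, so it only shows the direction of the effect, not its cause on any particular board.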

> If I go with the fixed period mode:
> $ perf stat -e cycles noploop 10
> noploop for 10 seconds
>  Performance counter stats for 'noploop 10':
>        10079156960 cycles                    #    0.000 GHz
>       10.004547117 seconds time elapsed
> 
> That means, if I want 100 samples/sec: = 10079156960/(10*100)=10079157
> $ perf record -e cycles -c 10079157 noploop 10
> $ perf report -D | tail -20
> cycles stats:
>            TOTAL events:       1003
>             MMAP events:         11
>             COMM events:          2
>             EXIT events:          2
>         THROTTLE events:          1
>       UNTHROTTLE events:          1
>           SAMPLE events:        986
> 
> Now, we're getting the right answer!

Just to confirm, for me:

$ perf stat -e cycles ./noploop 10
noploop for 10 seconds

 Performance counter stats for './noploop 10':

        4001163930 cycles                    #    0.000 GHz

      10.006534024 seconds time elapsed

$ perf record -e cycles -c 4001163 ./noploop 10
$ perf report -D | tail -20
  Aggregated stats:
           TOTAL events:       1020
            MMAP events:         18
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        998
cycles stats:
           TOTAL events:       1020
            MMAP events:         18
            COMM events:          2
            EXIT events:          2
          SAMPLE events:        998

which is close enough :)
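
The fixed period passed to -c in both runs is just the perf stat cycle count divided by the number of samples wanted. A small sketch of that calculation, using the counts quoted in this thread (the helper names are illustrative; integer arithmetic is used to avoid floating-point rounding):

```python
def fixed_period(total_cycles, duration_s, samples_per_sec, round_nearest=True):
    """Cycle period for `perf record -c` giving about samples_per_sec samples."""
    wanted_samples = duration_s * samples_per_sec
    if round_nearest:
        # round to the nearest integer
        return (total_cycles + wanted_samples // 2) // wanted_samples
    # otherwise truncate
    return total_cycles // wanted_samples

def expected_samples(total_cycles, period):
    """Number of counter overflows expected for a given period."""
    return total_cycles // period

# OMAP4 count: rounding gives the value used with -c in the quoted run
omap4_period = fixed_period(10079156960, 10, 100)
# vexpress count: truncating gives the value used with -c above
vexpress_period = fixed_period(4001163930, 10, 100, round_nearest=False)

print(omap4_period, expected_samples(10079156960, omap4_period))
print(vexpress_period, expected_samples(4001163930, vexpress_period))
```

Both results land within a sample or so of 1000, in line with the 986 and 998 SAMPLE events reported.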

> We need to elucidate what's going on in perf_event_task_tick().
> I have tried with my throttling fix and it did not help. We are
> not subject to throttling with such a low rate.

Ok. I would start by looking at the clock ticks if I were you, since this
seems to be alright on my board.

Will


