oprofile and ARM A9 hardware counter
stephane eranian
eranian at googlemail.com
Mon Jan 30 15:45:44 EST 2012
On Mon, Jan 30, 2012 at 8:14 PM, Will Deacon <will.deacon at arm.com> wrote:
> On Mon, Jan 30, 2012 at 05:45:19PM +0000, stephane eranian wrote:
>> There you go, no attachment, not sure the omap list
>> supports this.
>
> Cheers Stephane.
>
>> There is something quite interesting to observe.
>>
>> While I run perf record -e cycles -F 100 noploop 10, I watch
>> /proc/interrupts. The number of interrupts is way lower than
>> expected. Therefore the number of samples is way too low:
>>
>> $ perf record -e cycles -F 100 noploop 10
>> $ perf report -D | tail -20
>> cycles stats:
>> TOTAL events: 535
>> MMAP events: 11
>> COMM events: 2
>> EXIT events: 2
>> SAMPLE events: 520
>>
>> The delta in /proc/interrupts on CPU1 is 520 interrupts.
>
> Yes, that is about half of what you'd expect. Running on my A9 platform
> (vexpress) I get:
>
> $ perf record -e cycles -F 100 noploop 10
> $ perf report -D | tail -20
> cycles stats:
> TOTAL events: 1007
> MMAP events: 18
> COMM events: 2
> EXIT events: 2
> SAMPLE events: 985
>
>> So it looks like the frequency adjustment, which is hooked off of the
>> timer tick, is either not called at each timer tick, the timer ticks are
>> not at regular intervals, or the math is wrong.
>
> My hunch is that the interval is probably varying, but I don't know much
> about OMAP4 and its clocks.
>
Glad you tested this. At least it seems the generic perf_event code
is all right. I agree with you, something is fishy with the clocks.
Just out of curiosity, what is the HZ value for your board? On my
Panda it's 128Hz.
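
To put numbers on the shortfall in frequency mode, here is a minimal
sketch that compares the sample counts reported above against what
-F 100 over a 10 second run should give. The helper name is made up for
illustration, and it assumes the ideal count is simply frequency times
run time:

```python
# Rough check of frequency-mode sampling (perf record -F <freq>).
# Assumption: the ideal number of samples is freq_hz * run_seconds.

def sample_ratio(observed_samples, freq_hz, run_seconds):
    """Fraction of the ideal sample count that was actually recorded."""
    expected = freq_hz * run_seconds
    return observed_samples / expected

# SAMPLE event counts taken from the two perf report -D outputs above.
panda_ratio = sample_ratio(520, 100, 10)      # OMAP4 Panda
vexpress_ratio = sample_ratio(985, 100, 10)   # A9 vexpress

print(f"Panda:    {panda_ratio:.3f} of expected")
print(f"vexpress: {vexpress_ratio:.3f} of expected")
```

The Panda run lands at roughly half of the expected count, while the
vexpress run is within a couple of percent.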
>> If I go with the fixed period mode:
>> $ perf stat -e cycles noploop 10
>> noploop for 10 seconds
>> Performance counter stats for 'noploop 10':
>> 10079156960 cycles # 0.000 GHz
>> 10.004547117 seconds time elapsed
>>
>> That means, if I want 100 samples/sec, the period is:
>> 10079156960/(10*100) = 10079157
>> $ perf record -e cycles -c 10079157 noploop 10
>> $ perf report -D | tail -20
>> cycles stats:
>> TOTAL events: 1003
>> MMAP events: 11
>> COMM events: 2
>> EXIT events: 2
>> THROTTLE events: 1
>> UNTHROTTLE events: 1
>> SAMPLE events: 986
>>
>> Now, we're getting the right answer!
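
The period arithmetic above can be written out as a small sketch. The
function name is hypothetical, and it assumes the nominal 10 second run
time rather than the measured elapsed time:

```python
# Derive a fixed sampling period (perf record -c <period>) from a
# cycle count measured with perf stat.
# Assumption: run_seconds is the nominal run time, not the elapsed time.

def fixed_period(total_cycles, run_seconds, samples_per_sec):
    """Cycles per sample needed to get samples_per_sec, rounded to nearest."""
    wanted_samples = run_seconds * samples_per_sec
    # Integer round-to-nearest, avoids any floating point error.
    return (total_cycles + wanted_samples // 2) // wanted_samples

# Cycle count from the perf stat run above.
panda_period = fixed_period(10079156960, 10, 100)
print(panda_period)
```

This rounds to the nearest cycle, which gives the 10079157 used above.
Truncating the quotient instead changes the period by at most one cycle,
which is negligible at this rate.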
>
> Just to confirm, for me:
>
> $ perf stat -e cycles ./noploop 10
> noploop for 10 seconds
>
> Performance counter stats for './noploop 10':
>
> 4001163930 cycles # 0.000 GHz
>
> 10.006534024 seconds time elapsed
>
> $ perf record -e cycles -c 4001163 ./noploop 10
> $ perf report -D | tail -20
> Aggregated stats:
> TOTAL events: 1020
> MMAP events: 18
> COMM events: 2
> EXIT events: 2
> SAMPLE events: 998
> cycles stats:
> TOTAL events: 1020
> MMAP events: 18
> COMM events: 2
> EXIT events: 2
> SAMPLE events: 998
>
> which is close enough :)
>
>> We need to elucidate what's going on in perf_event_task_tick().
>> I have tried with my throttling fix and it did not help. We are
>> not subject to throttling with such a low rate.
>
> Ok. I would start by looking at the clock ticks if I were you, since this
> seems to be alright on my board.
>
> Will