oprofile and ARM A9 hardware counter

Ming Lei ming.lei at canonical.com
Fri Feb 17 00:24:02 EST 2012


On Fri, Feb 17, 2012 at 2:08 AM, Will Deacon <will.deacon at arm.com> wrote:

>
> The more I think about this, the more I think that the overflow parameter to
> armpmu_event_update needs to go. It was introduced to prevent massive event
> loss in non-sampling mode, but I think we can get around that by changing
> the default sample_period to be half of the max_period, therefore giving
> ourselves a much better chance of handling the interrupt before new wraps
> around past prev.
>
> Ming Lei - can you try the following please? If it works for you, then I'll
> do it properly and kill the overflow parameter altogether.

Of course, it does work for the problem reported by Stephane, since
it changes the delta computation basically the way I did, but I am afraid
that it may not be good enough for the issue fixed in a737823d ("ARM: 6835/1:
perf: ensure overflows aren't missed due to IRQ latency").

>
> Thanks,
>
> Will
>
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 5bb91bf..ef597a3 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -193,13 +193,7 @@ again:
>                             new_raw_count) != prev_raw_count)
>                goto again;
>
> -       new_raw_count &= armpmu->max_period;
> -       prev_raw_count &= armpmu->max_period;
> -
> -       if (overflow)
> -               delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
> -       else
> -               delta = new_raw_count - prev_raw_count;
> +       delta = (new_raw_count - prev_raw_count) & armpmu->max_period;
>
>        local64_add(delta, &event->count);
>        local64_sub(delta, &hwc->period_left);
> @@ -518,7 +512,7 @@ __hw_perf_event_init(struct perf_event *event)
>        hwc->config_base            |= (unsigned long)mapping;
>
>        if (!hwc->sample_period) {
> -               hwc->sample_period  = armpmu->max_period;
> +               hwc->sample_period  = armpmu->max_period >> 1;

If you assume that the issue addressed by a737823d can only happen in
non-sampling mode, Peter's idea of a u32 cast is OK and maybe simpler.

But I am afraid that the issue can still be triggered in sampling mode,
especially at very high sample frequencies: suppose the sample freq is 10000,
then a 100us IRQ delay may trigger the issue.

So we may use the overflow information to make perf more robust, IMO.
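
To illustrate the concern, here is a minimal user-space sketch (not kernel
code) that compares the two delta computations from the diff above.
MAX_PERIOD, delta_masked() and delta_with_flag() are names made up for this
example; MAX_PERIOD stands in for armpmu->max_period on a 32-bit counter.

```c
#include <stdint.h>

/* Assumption: a 32-bit counter, so max_period is 0xffffffff. */
#define MAX_PERIOD 0xffffffffULL

/* Delta as in the proposed patch: plain modular subtraction. */
static uint64_t delta_masked(uint64_t prev, uint64_t new_raw)
{
	return (new_raw - prev) & MAX_PERIOD;
}

/* Delta as in the code being removed: trusts the overflow flag. */
static uint64_t delta_with_flag(uint64_t prev, uint64_t new_raw, int overflow)
{
	new_raw &= MAX_PERIOD;
	prev &= MAX_PERIOD;

	if (overflow)
		return MAX_PERIOD - prev + new_raw + 1;

	return new_raw - prev;
}
```

When the counter wraps once and new is still below prev (prev = 0xfffffff0,
new = 0x10), both forms give 0x20. When the interrupt is handled so late that
new has wrapped past prev (prev = 0x80000000, new = 0x80000010, overflow set),
delta_masked() gives 0x10 while delta_with_flag() gives 0x100000010, so the
masked form silently drops one full counter period.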

thanks
--
Ming Lei


