[PATCH v5 05/11] arm-cci PMU: Delay counter writes to pmu_enable
Mark Rutland
mark.rutland at arm.com
Mon Jan 4 11:24:01 PST 2016
On Mon, Jan 04, 2016 at 11:54:44AM +0000, Suzuki K. Poulose wrote:
> Delay setting the event periods for enabled events to pmu::pmu_enable().
> We mark the event.hw->state PERF_HES_ARCH for the events that we know
> have their counts recorded and have been started.
Please add a comment to the code stating exactly what PERF_HES_ARCH
means for the CCI PMU driver, so it's easy to find.
> Since we reprogram the counters every time before count, we can set
> the counters for all the event counters which are !STOPPED && ARCH.
>
> Grouping the writes to counters can amortise the cost of the operation
> on PMUs where it is expensive (e.g., CCI-500).
>
> Cc: Mark Rutland <mark.rutland at arm.com>
> Cc: Punit Agrawal <punit.agrawal at arm.com>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Signed-off-by: Suzuki K. Poulose <suzuki.poulose at arm.com>
> ---
> drivers/bus/arm-cci.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/bus/arm-cci.c b/drivers/bus/arm-cci.c
> index 0189f3a..c768ee4 100644
> --- a/drivers/bus/arm-cci.c
> +++ b/drivers/bus/arm-cci.c
> @@ -916,6 +916,40 @@ static void hw_perf_event_destroy(struct perf_event *event)
> }
> }
>
> +/*
> + * Program the CCI PMU counters which have PERF_HES_ARCH set
> + * with the event period and mark them ready before we enable
> + * PMU.
> + */
> +void cci_pmu_update_counters(struct cci_pmu *cci_pmu)
> +{
> + int i;
> + unsigned long mask[BITS_TO_LONGS(cci_pmu->num_cntrs)];
I think this can be:

	DECLARE_BITMAP(mask, cci_pmu->num_cntrs);

> +
> + memset(mask, 0, BITS_TO_LONGS(cci_pmu->num_cntrs) * sizeof(unsigned long));
Likewise:

	bitmap_zero(mask, cci_pmu->num_cntrs);
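
For reference, a minimal userspace sketch of what the two helpers amount to,
under the assumption that DECLARE_BITMAP() declares an array of
BITS_TO_LONGS(bits) unsigned longs and bitmap_zero() clears that many words.
The MY_* and my_* names are hypothetical stand-ins, not kernel symbols:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Userspace stand-ins for the kernel helpers; all names are hypothetical. */
#define MY_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MY_BITS_TO_LONGS(n)	(((n) + MY_BITS_PER_LONG - 1) / MY_BITS_PER_LONG)
#define MY_DECLARE_BITMAP(name, bits) \
	unsigned long name[MY_BITS_TO_LONGS(bits)]

/* Clear every word backing an nbits-wide bitmap. */
static void my_bitmap_zero(unsigned long *dst, unsigned int nbits)
{
	assert(dst != NULL);
	memset(dst, 0, MY_BITS_TO_LONGS(nbits) * sizeof(unsigned long));
}

/*
 * Declare and clear a bitmap sized at run time (nbits must be > 0),
 * then report whether every backing word is zero.
 */
static int my_fresh_bitmap_is_empty(unsigned int nbits)
{
	MY_DECLARE_BITMAP(mask, nbits);	/* a VLA when nbits is not constant */
	unsigned int i;

	my_bitmap_zero(mask, nbits);
	for (i = 0; i < MY_BITS_TO_LONGS(nbits); i++)
		if (mask[i])
			return 0;
	return 1;
}
```

The size computation is then written once, in the helper, rather than being
repeated at the declaration and at the memset().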
> +
> + for_each_set_bit(i, cci_pmu->hw_events.used_mask, cci_pmu->num_cntrs) {
> + struct hw_perf_event *hwe;
> +
> + if (!cci_pmu->hw_events.events[i]) {
> + WARN_ON(1);
> + continue;
> + }
> +
	if (WARN_ON(!cci_pmu->hw_events.events[i]))
		continue;

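
This works because WARN_ON() evaluates to the truth value of its condition.
A minimal userspace sketch of the pattern follows; MY_WARN_ON() and
my_count_valid() are hypothetical stand-ins that only model that behaviour,
not the kernel implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in: report a true condition, then return it. */
static int my_warn_on(int condition, const char *file, int line)
{
	if (condition)
		fprintf(stderr, "WARNING at %s:%d\n", file, line);
	return condition;
}
#define MY_WARN_ON(cond) my_warn_on(!!(cond), __FILE__, __LINE__)

/* Count the non-NULL slots, warning about and skipping the empty ones. */
static int my_count_valid(const int *const *events, size_t n)
{
	size_t i;
	int valid = 0;

	assert(events != NULL);
	for (i = 0; i < n; i++) {
		if (MY_WARN_ON(!events[i]))
			continue;
		valid++;
	}
	return valid;
}
```

The check and the warning collapse into a single statement, which is why the
braces and the explicit WARN_ON(1) are not needed.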
> + hwe = &cci_pmu->hw_events.events[i]->hw;
> + /* Leave the events which are not counting */
> + if (hwe->state & PERF_HES_STOPPED)
> + continue;
> + if (hwe->state & PERF_HES_ARCH) {
> + set_bit(i, mask);
> + hwe->state &= ~PERF_HES_ARCH;
> + local64_set(&hwe->prev_count, CCI_CNTR_PERIOD);
> + }
> + }
> +
> + pmu_write_counters(cci_pmu, mask, CCI_CNTR_PERIOD);
> +}
> +
> static void cci_pmu_enable(struct pmu *pmu)
> {
> struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
> @@ -927,6 +961,7 @@ static void cci_pmu_enable(struct pmu *pmu)
> return;
>
> raw_spin_lock_irqsave(&hw_events->pmu_lock, flags);
> + cci_pmu_update_counters(cci_pmu);
> __cci_pmu_enable();
> raw_spin_unlock_irqrestore(&hw_events->pmu_lock, flags);
>
> @@ -980,8 +1015,11 @@ static void cci_pmu_start(struct perf_event *event, int pmu_flags)
> /* Configure the counter unless you are counting a fixed event */
> if (!pmu_fixed_hw_idx(cci_pmu, idx))
> pmu_set_event(cci_pmu, idx, hwc->config_base);
> -
> - pmu_event_set_period(event);
> + /*
> + * Mark this counter, so that we can program the
> + * counter with the event_period. see cci_pmu_enable()
> + */
> + hwc->state = PERF_HES_ARCH;
Why couldn't we have kept pmu_event_set_period here, and have that set
prev_count and PERF_HES_ARCH?
Then we'd be able to do the same batching for overflow too.
What am I missing?
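
To make the question concrete, here is a minimal standalone userspace model
of that arrangement (not driver code, and untested against hardware). The
set_period step only records the period and flags the counter, and a separate
flush step does the grouped write. Every my_*/MY_* name is hypothetical, the
flag values mirror PERF_HES_STOPPED and PERF_HES_ARCH, and the period value
is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MY_NUM_CNTRS	4		/* hypothetical counter count */
#define MY_HES_STOPPED	0x01u		/* stand-in for PERF_HES_STOPPED */
#define MY_HES_ARCH	0x04u		/* stand-in for PERF_HES_ARCH */
#define MY_CNTR_PERIOD	(1ull << 31)	/* assumed value for CCI_CNTR_PERIOD */

struct my_event {
	unsigned int state;
	uint64_t prev_count;
};

struct my_pmu {
	struct my_event events[MY_NUM_CNTRS];
	uint64_t hw_counter[MY_NUM_CNTRS];	/* models the counter registers */
	unsigned int batched_writes;		/* number of grouped write passes */
};

/* Record the new period and flag the counter; no register access here. */
static void my_event_set_period(struct my_event *ev)
{
	assert(ev != NULL);
	ev->prev_count = MY_CNTR_PERIOD;
	ev->state |= MY_HES_ARCH;
}

/* Write every flagged, running counter in one pass; return how many. */
static unsigned int my_flush_pending(struct my_pmu *pmu)
{
	unsigned int i, written = 0;

	for (i = 0; i < MY_NUM_CNTRS; i++) {
		struct my_event *ev = &pmu->events[i];

		if (ev->state & MY_HES_STOPPED)
			continue;	/* leave events that are not counting */
		if (!(ev->state & MY_HES_ARCH))
			continue;	/* nothing pending for this counter */
		ev->state &= ~MY_HES_ARCH;
		pmu->hw_counter[i] = ev->prev_count;
		written++;
	}
	if (written)
		pmu->batched_writes++;
	return written;
}
```

In this model both the start path and the overflow handler could call
my_event_set_period(), with the flush running once from the enable path.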
Mark.
> pmu_enable_counter(cci_pmu, idx);
>
> raw_spin_unlock_irqrestore(&hw_events->pmu_lock, flags);
> --
> 1.7.9.5
>