[PATCH v1 1/4] perf: Allow periodic events to alternate between two sample periods

Peter Zijlstra peterz at infradead.org
Thu Nov 14 07:01:52 PST 2024


On Thu, Nov 07, 2024 at 04:07:18PM +0000, Deepak Surti wrote:
> From: Ben Gainey <ben.gainey at arm.com>
> 
> This change modifies perf_event_attr to add a second, alternative
> sample period field, and modifies the core perf overflow handling
> such that when specified an event will alternate between two sample
> periods.
> 
> Currently, perf does not provide a mechanism for decoupling the period
> over which counters are counted from the period between samples. This is
> problematic for building a tool to measure per-function metrics derived
> from a sampled counter group. Ideally such a tool wants a very small
> sample window in order to correctly attribute the metrics to a given
> function, but prefers a larger sample period that provides representative
> coverage without excessive probe effect, triggering throttling, or
> generating excessive amounts of data.
> 
> By alternating between a long and short sample_period and subsequently
> discarding the long samples, tools may decouple the period between
> samples that the tool cares about from the window of time over which
> interesting counts are collected.

Do you have a link to a paper or something that explains this method?


> +	/*
> +	 * Indicates that the alternative_sample_period is used
> +	 */
> +	bool				using_alternative_sample_period;

I typically prefer variable names that are shorter.


> @@ -9822,6 +9825,26 @@ static int __perf_event_overflow(struct perf_event *event,
>  	    !bpf_overflow_handler(event, data, regs))
>  		return ret;
>  
> +	/*
> +	 * Swap the sample period to the alternative period
> +	 */
> +	if (event->attr.alternative_sample_period) {
> +		bool using_alt = hwc->using_alternative_sample_period;
> +		u64 sample_period = (using_alt ? event->attr.sample_period
> +					       : event->attr.alternative_sample_period);
> +
> +		hwc->sample_period = sample_period;
> +		hwc->using_alternative_sample_period = !using_alt;
> +
> +		if (local64_read(&hwc->period_left) > 0) {
> +			event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +			local64_set(&hwc->period_left, 0);
> +
> +			event->pmu->start(event, PERF_EF_RELOAD);
> +		}

This is quite terrible :-(

Getting here means we just went through the effort of programming the
period and you'll pretty much always hit that 'period_left > 0' case.

Why do we need this case at all? If you don't do this, then the next
overflow will pick the period you just wrote to hwc->sample_period
(although you might want to audit all arch implementations).

Looking at it again, that truncation to 0 is just plain wrong -- always.
Why are you doing this?
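
To illustrate the point about the next overflow picking up the new period,
here is a minimal userspace toy model, not kernel code. The toy_* names and
structs are hypothetical stand-ins for the hw_perf_event and perf_event_attr
fields quoted above, and the model assumes the PMU driver reprograms the
counter from hwc->sample_period before the core overflow handler runs. Under
that assumption, swapping the period alone, with no stop/start and no
period_left reset, still yields alternating periods, delayed by one overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant hw_perf_event fields. */
struct toy_hwc {
	uint64_t sample_period;	/* period used at the next reprogram */
	int64_t  period_left;	/* events remaining until overflow */
	bool     using_alt;	/* true while the alternative period is selected */
};

/* Hypothetical stand-in for the relevant perf_event_attr fields. */
struct toy_attr {
	uint64_t sample_period;
	uint64_t alternative_sample_period;
};

/* Models the PMU driver reloading the counter after an overflow. */
static void toy_set_period(struct toy_hwc *hwc)
{
	hwc->period_left = (int64_t)hwc->sample_period;
}

/* Models the overflow path: only swap the period, no stop/start. */
static void toy_overflow(struct toy_hwc *hwc, const struct toy_attr *attr)
{
	if (attr->alternative_sample_period) {
		hwc->sample_period = hwc->using_alt
					? attr->sample_period
					: attr->alternative_sample_period;
		hwc->using_alt = !hwc->using_alt;
	}
}

/* Run n overflows and record the number of events between each one. */
static void toy_run(const struct toy_attr *attr, uint64_t *gaps, int n)
{
	struct toy_hwc hwc = { .sample_period = attr->sample_period };

	toy_set_period(&hwc);
	for (int i = 0; i < n; i++) {
		uint64_t count = 0;

		while (hwc.period_left > 0) {
			hwc.period_left--;
			count++;
		}
		gaps[i] = count;

		/* Assumed order: driver reprograms first, then the core handler runs. */
		toy_set_period(&hwc);
		toy_overflow(&hwc, attr);
	}
}
```

With sample_period = 1000 and alternative_sample_period = 10, the recorded
gaps in this model are 1000, 1000, 10, 1000, 10: the first swap takes effect
one overflow late and the periods alternate from then on. Whether real PMU
drivers all behave this way is exactly what the arch audit would need to
confirm.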
