[PATCH 2/2] soc: qcom: add l2 cache perf events driver

Neil Leeder nleeder at codeaurora.org
Fri Jun 10 15:34:04 PDT 2016



On 6/9/2016 03:41 PM, Peter Zijlstra wrote:
> On Thu, Jun 09, 2016 at 04:56:16PM +0100, Mark Rutland wrote:
>>>>> +static irqreturn_t l2_cache__handle_irq(int irq_num, void *data)
>>>>> +{
>>>>> +	struct hml2_pmu *slice = data;
>>>>> +	u32 ovsr;
>>>>> +	int idx;
>>>>> +	struct pt_regs *regs;
>>>>> +
>>>>> +	ovsr = hml2_pmu__getreset_ovsr();
>>>>> +	if (!hml2_pmu__has_overflowed(ovsr))
>>>>> +		return IRQ_NONE;
>>>>> +
>>>>> +	regs = get_irq_regs();
>>>>> +
>>>>> +	for (idx = 0; idx < l2cache_pmu.num_counters; idx++) {
>>>>> +		struct perf_event *event = slice->events[idx];
>>>>> +		struct hw_perf_event *hwc;
>>>>> +		struct perf_sample_data data;
>>>>> +
>>>>> +		if (!event)
>>>>> +			continue;
>>>>> +
>>>>> +		if (!hml2_pmu__counter_has_overflowed(ovsr, idx))
>>>>> +			continue;
>>>>> +
>>>>> +		l2_cache__event_update_from_slice(event, slice);
>>>>> +		hwc = &event->hw;
>>>>> +
>>>>> +		if (is_sampling_event(event)) {
>>>>> +			perf_sample_data_init(&data, 0, hwc->last_period);
>>>>
>>>> I don't think sampling makes sense, given this is an uncore PMU and the
>>>> events are triggered by other CPUs.
>>>
>>> There is origin filtering so events can be attributed to a CPU when sampling.
>>
>> Ok. I believe that's different from all other uncore PMUs we support
>> (none of the drivers support sampling, certainly), so I'm not entirely
>> sure how/if we can make use of that sanely and reliably.
> 
> Right; because not only do you need to know which CPU originated the
> event, the IRQ must also happen on that CPU. Simply knowing which CPU
> triggered it is not enough for sampling.
> 
>> For the timebeing, I think this sampling needs to go, and the event_init
>> logic needs to reject sampling as with other uncore PMU drivers.
> 
> Agreed.

I want to make sure I understand what the concern is here.
Given the hardware filter which restricts counting to events generated by
a specific CPU, and an irq which is affine to that CPU, sampling and task mode
would seem to work for a single perf use. 
Is the issue only related to multiple concurrent perf uses?

> 
>> One thing I forgot to mention in my earlier comments is that as an
>> uncore PMU you need to have task_ctx_nr = perf_invalid_context here
>> also.
> 
> For good reasons, uncore PMUs (as is the case here) count strictly more
> than the effect of single CPUs (and thus also the current task). So
> attributing it back to a task is nonsense.
> 

Thanks,
Neil

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.



More information about the linux-arm-kernel mailing list