[PATCH v7 8/9] arm64: pmu: Detect and enable multiple PMUs in an ACPI system

Fri Aug 26 15:44:59 PDT 2016

Hi,

On 08/26/2016 10:04 AM, Punit Agrawal wrote:
(trimming)
>> +			pmu = kzalloc(sizeof(struct pmu_types), GFP_KERNEL);
>> +			if (!pmu) {
>> +				pr_warn("Unable to allocate pmu_types\n");
>> +				/*
>> +				 * continue to count cpus for any pmu_types
>> +				 * already allocated, but don't allocate any
>> +				 * more pmu_types. This avoids undercounting.
>> +				 */
>> +				alloc_failure = true;
>
> Why not just fail probe and return an error? What is the benefit of
> having some of the PMUs available?

AFAIC, there isn't a good reason for penalizing PMU's which we can get 
working if a subset of the system PMUs can't be created. But this is per 
PMU type, so with current systems the kzalloc will be called a max of 2 
times (there is the potential of a 3rd time, due to some other error 
handling, but that doesn't change the argument much). AKA, this doesn't 
result in "partial registration" of a PMU.

So, really the only question in my mind is does it work if one of the 
allocations fails and the other succeeds, and the answer is yes, the 
remaining interrupts/etc get associated with the correct PMU, and it 
gets created and should work as well as perf currently works in systems 
with heterogeneous PMUs (cue discussion about CPU process migration).

So, since this is early boot, and we are taking a tiny allocation, if 
this fails i suspect that the machine will probably die anyway, not due 
to the choice of whether the PMU is counted properly or not. I would 
guess the platform allocation or similar will die..

(trimming)

>> +/*
>> + * For the given cpu/pmu type, walk all known GSIs, register them, and add
>> + * them to the resource structure. Return the number of GSI's contained
>> + * in the res structure, and the id of the last CPU/PMU we added.
>> + */
>> +static int __init arm_pmu_acpi_gsi_res(struct pmu_types *pmus,
>> +				       struct resource *res, int *last_cpu_id)
>> +{
>> +	int i, count;
>> +	int irq;
>> +
>> +	/* lets group all the PMU's from similar CPU's together */
>> +	count = 0;
>> +	for_each_possible_cpu(i) {
>> +		struct cpuinfo_arm64 *cinfo = per_cpu_ptr(&cpu_data, i);
>> +
>> +		if (pmus->cpu_type == MIDR_PARTNUM(cinfo->reg_midr)) {
>> +			if (pmu_irqs[i].gsi == 0)
>> +				continue;
>
> Please don't silently continue if the irq is missing. It deserves a user
> visible message. We don't want users complaining about kernel issues
> when the firmware fails to provide the required information.

See below.

>
>> +
>> +			irq = acpi_register_gsi(NULL, pmu_irqs[i].gsi,
>> +						pmu_irqs[i].trigger,
>> +						ACPI_ACTIVE_HIGH);
>
> Check for the return value of acpi_register_gsi as it can return an
> error.

Ok, so this is probably a little subtle, but IIRC from a few months back 
when I was testing this/reworking it, duplicate GSI registrations for 
exclusive PPI's result in errors (see any random discussion about PPI's 
not being ACPI GSI's). As the code to handle SPI vs PPI exists in the 
platform code, I decided to ignore registration errors until such a 
determination can be made. AKA, i'm potentially tossing invalid irq's 
into the irq list for PPIs, but it doesn't matter because they are 
temporary ignored. The core ARM PMU code, has a much better handle on 
what is a correct interrupt binding, so the decision about whether these 
failures need to be worried about are delayed until then. This results 
in a large simplification because we handle the irq deregistration 
necessarily for any further errors together, AKA you will notice a 
complete lack of differing code paths for PPI vs SPI in this module.

As far as gsi=0, in cases where there are placeholder GICC entries in 
the MADT (think disabled CPU) then i've seen gsi=0. I'm not sure that is 
incorrect, so again it gets skipped, the pmu mask doesn't have it 
listed, and everything "works". So if i'm going to add a message here 
i'm, going to wrap it in a reg_midr!=0 check.

>
>> +
>> +			res[count].start = res[count].end = irq;
>> +			res[count].flags = IORESOURCE_IRQ;
>> +
>> +			if (pmu_irqs[i].trigger == ACPI_EDGE_SENSITIVE)
>> +				res[count].flags |= IORESOURCE_IRQ_HIGHEDGE;
>> +			else
>> +				res[count].flags |= IORESOURCE_IRQ_HIGHLEVEL;
>> +
>> +			pmu_irqs[i].registered = true;
>> +			count++;
>> +			(*last_cpu_id) = cinfo->reg_midr;
>
> What is the benefit of using the entire MIDR for cpu_id when the
> grouping is done on the basis of a subset, i.e., part number.

Because the platform code isn't partnum specific, unless the the 
ARMV8_PMU_PART_PROBE() macro is utilized. Delaying any decisions about 
what part of the MIDR happens to be used (if any) to the part probe 
table is probably a good idea. Especially if someone happens to think 
that they want a ARMV8_PMU_PART_REVISION_PROBE() or similar macro..

If anything It might be worth removing the partnum checks in this module 
as well. That way should someone ship a machine with the same CPU only 
differing by revision (or pick your favorite !partnum field) they get 
different PMUs definitions. Why anyone would do that I cannot guess, but 
I do see that apparently the xscale version was important at one point.