[PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

Ilkka Koskinen ilkka at os.amperecomputing.com
Tue Feb 6 13:04:27 PST 2024



On Tue, 6 Feb 2024, Robin Murphy wrote:
> On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
>> AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
>> report incorrect child count. The failing crosspoints report 8 children
>> while they only have two.
>
> Ooh, fun :)
>
>> When the driver tries to access the inexistent child nodes, it believes it
>> has reached an invalid node type and probing fails. The workaround is to
>> ignore those incorrect child nodes and continue normally.
>> 
>> Signed-off-by: Ilkka Koskinen <ilkka at os.amperecomputing.com>
>> ---
>>   drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
>>   1 file changed, 25 insertions(+)
>> 
>> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
>> index c584165b13ba..97fed8ec3693 100644
>> --- a/drivers/perf/arm-cmn.c
>> +++ b/drivers/perf/arm-cmn.c
>> @@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
>>   	}
>>   }
>> +static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
>> +					     struct arm_cmn_node *dn,
>> +					     u16 child_count, int child)
>> +{
>> +	/*
>> +	 * The bug occurs only when a crosspoint reports 8 children
>> +	 * while it only has two HN-P child nodes.
>> +	 */
>> +	dn -= 2;
>> +
>> +	if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
>> +	    child == 2 && dn->type == CMN_TYPE_HNP)
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
>>   static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>>   {
>>   	void __iomem *cfg_region;
>> @@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>>     		for (j = 0; j < child_count; j++) {
>>   			reg = readq_relaxed(xp_region + child_poff + j * 8);
>> +			if (reg == 0)
>> +				if (arm_cmn_is_ampereonex_bug(cmn, dn, child_count, j))
>> +					/*
>> +					 * We know there are only two real children and the rest 6
>> +					 * are inexistent. Thus, we can skip the rest of the loop
>> +					 */
>> +					break;
>> +
>
> TBH I don't see much harm in taking an even simpler approach, so I'd be
> inclined to not bother being all that specific beyond documenting it,
> something like the below:

Sounds good to me.

>
> Cheers,
> Robin.
>
> ----->8-----
>
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index c584165b13ba..7e3aa7e2345f 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
> 				dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
> 				continue;
> 			}
> +			/*
> +			 * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
> +			 * child count larger than the number of valid child pointers.
> +			 * A child offset of 0 can only occur on CMN-600; otherwise it
> +			 * would imply the root node being its own grandchild, which
> +			 * we can safely dismiss in general.
> +			 */
> +			if (reg == 0 && cmn->part != PART_CMN600) {
> +				dev_dbg(cmn->dev, "bogus child pointer?\n");
> +				continue;
> +			}
>  			arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
>
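
For anyone following the thread without the driver source at hand, here is a
minimal userspace sketch of the simpler check proposed above: walk the child
pointer values and skip any that read as zero unless the part is CMN-600.
It is an illustration only, not the driver code. The names sketch_part,
SKETCH_CHILD_NODE_ADDR and sketch_collect_children are hypothetical, the
mask width is an assumption, and a plain array stands in for the
readq_relaxed() accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's part identifiers. */
enum sketch_part { SKETCH_PART_CMN600, SKETCH_PART_CMN650 };

/* Plays the role of CMN_CHILD_NODE_ADDR; the mask width is an assumption. */
#define SKETCH_CHILD_NODE_ADDR ((UINT64_C(1) << 28) - 1)

/*
 * Walk child_count child pointer values (an array here, MMIO reads in the
 * real driver) and store the offsets of the children that would be probed.
 * Returns the number of offsets written to out[], which must have room for
 * child_count entries.
 */
size_t sketch_collect_children(enum sketch_part part, const uint64_t *regs,
			       size_t child_count, uint64_t *out)
{
	size_t n = 0;

	assert(regs != NULL && out != NULL);

	for (size_t j = 0; j < child_count; j++) {
		uint64_t reg = regs[j];

		/*
		 * A zero child pointer is treated as bogus on everything
		 * except CMN-600, so skip it and keep walking.
		 */
		if (reg == 0 && part != SKETCH_PART_CMN600)
			continue;

		out[n++] = reg & SKETCH_CHILD_NODE_ADDR;
	}

	return n;
}
```

With a reported child count of 8 and only the first two pointers non-zero,
this collects two children on a non-CMN-600 part, which matches the
situation described for the affected crosspoints. Using continue rather than
break means a valid pointer that follows a zero one would still be picked up.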

Tested-by: Ilkka Koskinen <ilkka at os.amperecomputing.com>

Cheers, Ilkka


