[PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

Robin Murphy robin.murphy at arm.com
Tue Feb 6 02:00:19 PST 2024


On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
> AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
> report incorrect child count. The failing crosspoints report 8 children
> while they only have two.

Ooh, fun :)

> When the driver tries to access the inexistent child nodes, it believes it
> has reached an invalid node type and probing fails. The workaround is to
> ignore those incorrect child nodes and continue normally.
> 
> Signed-off-by: Ilkka Koskinen <ilkka at os.amperecomputing.com>
> ---
>   drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index c584165b13ba..97fed8ec3693 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
>   	}
>   }
>   
> +static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
> +					     struct arm_cmn_node *dn,
> +					     u16 child_count, int child)
> +{
> +	/*
> +	 * The bug occurs only when a crosspoint reports 8 children
> +	 * while it only has two HN-P child nodes.
> +	 */
> +	dn -= 2;
> +
> +	if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
> +	    child == 2 && dn->type == CMN_TYPE_HNP)
> +		return true;
> +
> +	return false;
> +}
> +
>   static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>   {
>   	void __iomem *cfg_region;
> @@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
>   
>   		for (j = 0; j < child_count; j++) {
>   			reg = readq_relaxed(xp_region + child_poff + j * 8);
> +			if (reg == 0)
> +				if (arm_cmn_is_ampereonex_bug(cmn, dn, child_count, j))
> +					/*
> +					 * We know there are only two real children and the rest 6
> +					 * are inexistent. Thus, we can skip the rest of the loop
> +					 */
> +					break;
> +

TBH I don't see much harm in taking an even simpler approach, so I'd be
inclined to not bother being all that specific beyond documenting it,
something like the below:

Cheers,
Robin.

----->8-----

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index c584165b13ba..7e3aa7e2345f 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
  				dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
  				continue;
  			}
+			/*
+			 * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
+			 * child count larger than the number of valid child pointers.
+			 * A child offset of 0 can only occur on CMN-600; otherwise it
+			 * would imply the root node being its own grandchild, which
+			 * we can safely dismiss in general.
+			 */
+			if (reg == 0 && cmn->part != PART_CMN600) {
+				dev_dbg(cmn->dev, "bogus child pointer?\n");
+				continue;
+			}
  
  			arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
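
For illustration only, the effect of the check above can be modelled outside
the driver. This is a minimal user-space sketch, not driver code: the names
sketch_count_children() and enum sketch_part are hypothetical, and it only
assumes that a zero child pointer is treated as bogus on anything other than
CMN-600, as the comment in the diff describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's part check; not the real enum. */
enum sketch_part { SKETCH_PART_CMN600, SKETCH_PART_OTHER };

/*
 * Walk an XP's child pointer table and count the entries that would be
 * processed. A zero entry is skipped on anything other than CMN-600,
 * mirroring the "reg == 0 && cmn->part != PART_CMN600" test in the diff.
 */
static size_t sketch_count_children(const uint64_t *child_ptrs,
				    size_t child_count,
				    enum sketch_part part)
{
	size_t valid = 0;

	assert(child_ptrs != NULL);

	for (size_t j = 0; j < child_count; j++) {
		uint64_t reg = child_ptrs[j];

		if (reg == 0 && part != SKETCH_PART_CMN600)
			continue;	/* bogus child pointer, ignore it */

		valid++;
	}

	return valid;
}
```

Given a table of eight entries where only the first two are non-zero (the
situation the erratum describes), the sketch counts two children on a
non-CMN-600 part, while on CMN-600 all eight entries would still be processed.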
  


