[PATCH] iommu/arm-smmu: Pretty-print context fault related regs
Robin Murphy
robin.murphy at arm.com
Mon Jun 17 06:07:03 PDT 2024
On 04/06/2024 4:01 pm, Rob Clark wrote:
> From: Rob Clark <robdclark at chromium.org>
>
> Parse out the bitfields for easier-to-read fault messages.
>
> Signed-off-by: Rob Clark <robdclark at chromium.org>
> ---
> Stephen was wanting easier to read fault messages.. so I typed this up.
>
> Resend with the new iommu list address
>
> drivers/iommu/arm/arm-smmu/arm-smmu.c | 53 +++++++++++++++++++++++++--
> drivers/iommu/arm/arm-smmu/arm-smmu.h | 5 +++
> 2 files changed, 54 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index c572d877b0e1..06712d73519c 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -411,6 +411,8 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
> unsigned long iova;
> struct arm_smmu_domain *smmu_domain = dev;
> struct arm_smmu_device *smmu = smmu_domain->smmu;
> + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> + DEFAULT_RATELIMIT_BURST);
> int idx = smmu_domain->cfg.cbndx;
> int ret;
>
> @@ -425,10 +427,53 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
> ret = report_iommu_fault(&smmu_domain->domain, NULL, iova,
> fsynr & ARM_SMMU_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
>
> - if (ret == -ENOSYS)
> - dev_err_ratelimited(smmu->dev,
> - "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
> - fsr, iova, fsynr, cbfrsynra, idx);
> + if (ret == -ENOSYS && __ratelimit(&rs)) {
> + static const struct {
> + u32 mask; const char *name;
> + } fsr_bits[] = {
> + { ARM_SMMU_FSR_MULTI, "MULTI" },
> + { ARM_SMMU_FSR_SS, "SS" },
> + { ARM_SMMU_FSR_UUT, "UUT" },
> + { ARM_SMMU_FSR_ASF, "ASF" },
> + { ARM_SMMU_FSR_TLBLKF, "TLBLKF" },
> + { ARM_SMMU_FSR_TLBMCF, "TLBMCF" },
> + { ARM_SMMU_FSR_EF, "EF" },
> + { ARM_SMMU_FSR_PF, "PF" },
> + { ARM_SMMU_FSR_AFF, "AFF" },
> + { ARM_SMMU_FSR_TF, "TF" },
> + }, fsynr0_bits[] = {
> + { ARM_SMMU_FSYNR0_WNR, "WNR" },
> + { ARM_SMMU_FSYNR0_PNU, "PNU" },
> + { ARM_SMMU_FSYNR0_IND, "IND" },
> + { ARM_SMMU_FSYNR0_NSATTR, "NSATTR" },
> + { ARM_SMMU_FSYNR0_PTWF, "PTWF" },
> + { ARM_SMMU_FSYNR0_AFR, "AFR" },
> + };
> +
> + pr_err("%s %s: Unhandled context fault: fsr=0x%x (",
> + dev_driver_string(smmu->dev), dev_name(smmu->dev), fsr);
> +
> + for (int i = 0, n = 0; i < ARRAY_SIZE(fsr_bits); i++) {
> + if (fsr & fsr_bits[i].mask) {
> + pr_cont("%s%s", (n > 0) ? "|" : "", fsr_bits[i].name);
Given that SMMU faults have a high likelihood of correlating with other
errors, e.g. the initiating device also reporting that it got an abort
back, this much pr_cont is a recipe for an unreadable mess. Furthermore,
just imagine how "helpful" this would be when faults in two contexts are
reported by two different CPUs at the same time ;)

I'd prefer to retain the original message as-is, so there is at least
still an unambiguous "atomic" view of a fault's entire state, then
follow it with a decode more in the style of arm64's ESR logging. TBH I
also wouldn't disapprove of hiding the additional decode behind a
command-line/runtime parameter, since a fault storm can cripple a system
enough as it is, without making the interrupt handler spend even longer
printing to a potentially slow console.
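
To illustrate the shape suggested above, here is a minimal userspace sketch, not the driver code. It keeps the original one-line message intact and appends the decode as a separate, complete line that is built in a buffer first, so nothing depends on pr_cont. The helper names (decode_bits, format_fault), the "decode" flag standing in for a command-line/runtime parameter, and the exact layout of the second line are all assumptions made for illustration. The FSR bit positions are written from memory of arm-smmu.h and should be checked against the header.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BIT(n)		(UINT32_C(1) << (n))
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

struct bit_name {
	uint32_t mask;
	const char *name;
};

/* Assumed FSR bit positions; verify against arm-smmu.h before reuse. */
static const struct bit_name fsr_bits[] = {
	{ BIT(31), "MULTI" },
	{ BIT(30), "SS" },
	{ BIT(4),  "EF" },
	{ BIT(3),  "PF" },
	{ BIT(2),  "AFF" },
	{ BIT(1),  "TF" },
};

/* Build "A|B|C" into buf; returns the number of characters written. */
static size_t decode_bits(char *buf, size_t len, uint32_t val,
			  const struct bit_name *tbl, size_t n)
{
	size_t off = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int w;

		if (!(val & tbl[i].mask))
			continue;
		w = snprintf(buf + off, len - off, "%s%s",
			     off ? "|" : "", tbl[i].name);
		if (w < 0 || (size_t)w >= len - off) {
			buf[off] = '\0';	/* drop a partially written name */
			break;
		}
		off += (size_t)w;
	}
	return off;
}

/*
 * Format the raw "atomic" line, optionally followed by one decode line.
 * "decode" is a hypothetical stand-in for a module/runtime parameter.
 */
static int format_fault(char *out, size_t len, int decode, uint32_t fsr,
			unsigned long iova, uint32_t fsynr,
			uint32_t cbfrsynra, int idx)
{
	char names[96];

	if (!decode)
		return snprintf(out, len,
			"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
			(unsigned int)fsr, iova, (unsigned int)fsynr,
			(unsigned int)cbfrsynra, idx);

	decode_bits(names, sizeof(names), fsr, fsr_bits, ARRAY_SIZE(fsr_bits));
	return snprintf(out, len,
		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n"
		"  FSR = %s\n",
		(unsigned int)fsr, iova, (unsigned int)fsynr,
		(unsigned int)cbfrsynra, idx, names);
}
```

In the driver, each line would presumably go out through its own dev_err() call, so a message interleaved from another CPU can split the two lines apart but cannot corrupt either of them.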
> + n++;
> + }
> + }
> +
> + pr_cont("), iova=0x%08lx, fsynr=0x%x (S1CBNDX=%u", iova, fsynr,
> + (fsynr >> 16) & 0xff);
Please define all the bitfields properly (and I agree with Pranjal about
the naming).
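
As a rough illustration of "defining the bitfields properly", the open-coded shift and mask in the quoted hunk could be replaced by named field masks. This is a userspace sketch only: the macro names ARM_SMMU_FSYNR0_S1CBNDX and ARM_SMMU_FSYNR0_PLVL are hypothetical (the naming discussed elsewhere in the thread is not reproduced here), the field positions are taken from the "(fsynr >> 16) & 0xff" and "fsynr & 0x3" expressions in the patch, and field_get() is a local stand-in for the kernel's FIELD_GET() from <linux/bitfield.h>.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 32-bit only. */
#define GENMASK32(h, l) \
	(((~UINT32_C(0)) >> (31 - (h))) & ((~UINT32_C(0)) << (l)))

/* Hypothetical names; positions mirror the shifts used in the patch. */
#define ARM_SMMU_FSYNR0_S1CBNDX	GENMASK32(23, 16)
#define ARM_SMMU_FSYNR0_PLVL	GENMASK32(1, 0)

/* Userspace stand-in for FIELD_GET(): extract the field under mask. */
static inline uint32_t field_get(uint32_t mask, uint32_t reg)
{
	/* (~mask + 1) is -mask, so mask & -mask isolates the lowest set bit */
	return (reg & mask) / (mask & (~mask + 1));
}
```

With definitions like these, the format arguments would read field_get(ARM_SMMU_FSYNR0_S1CBNDX, fsynr) and field_get(ARM_SMMU_FSYNR0_PLVL, fsynr) instead of carrying magic numbers and a trailing comment.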
Thanks,
Robin.
> +
> + for (int i = 0; i < ARRAY_SIZE(fsynr0_bits); i++) {
> + if (fsynr & fsynr0_bits[i].mask) {
> + pr_cont("|%s", fsynr0_bits[i].name);
> + }
> + }
> +
> + pr_cont("|PLVL=%u), cbfrsynra=0x%x, cb=%d\n",
> + fsynr & 0x3, /* FSYNR0.PLV */
> + cbfrsynra, idx);
> +
> + }
>
> arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, fsr);
> return IRQ_HANDLED;
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 836ed6799a80..3b051273718b 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -223,6 +223,11 @@ enum arm_smmu_cbar_type {
>
> #define ARM_SMMU_CB_FSYNR0 0x68
> #define ARM_SMMU_FSYNR0_WNR BIT(4)
> +#define ARM_SMMU_FSYNR0_PNU BIT(5)
> +#define ARM_SMMU_FSYNR0_IND BIT(6)
> +#define ARM_SMMU_FSYNR0_NSATTR BIT(8)
> +#define ARM_SMMU_FSYNR0_PTWF BIT(10)
> +#define ARM_SMMU_FSYNR0_AFR BIT(11)
>
> #define ARM_SMMU_CB_FSYNR1 0x6c
>