[PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
Jason Gunthorpe
jgg at nvidia.com
Mon Feb 12 16:19:34 PST 2024
On Tue, Feb 13, 2024 at 01:18:24AM +0200, Dmitry Baryshkov wrote:
> > > For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> > > (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> > > devices makes the board boot in most of the cases. Most frequently the
> > > last parts of the log loog in a following way:
> >
> > It is surprising we tested this patch on some tegra systems with this
> > iommu and didn't hit anything..
> >
> > The only real functional thing this changes is to move the domain
> > initialization up in time, potentially a lot in time in some
> > cases. That function does alot of things including touching HW so
> > possibly there is some surprising interaction with something else.
> >
> > So, I would expect this to not WARN_ON and to make it work the same as
> > before the patch:
> >
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -875,7 +875,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> > mutex_init(&smmu_domain->init_mutex);
> > spin_lock_init(&smmu_domain->cb_lock);
> >
> > - if (dev) {
> > + WARN_ON(using_legacy_binding);
> > +
> > +/* if (dev) {
> > struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> >
> > if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> > @@ -883,7 +885,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> > return NULL;
> > }
> > }
> > -
> > +*/
> > return &smmu_domain->domain;
> > }
>
> With the full device tree, this crashed during the IOMMU probe, no warnings:
The above reverts nearly all the functional elements of the patch you
said caused the problem, are you certain of your bisection?
> > And then we may get a clue from the backtraces it generates. I only
> > saw one iommu group reported in your log so I'd expect one trace?
>
> With the full device tree, same result:
This adds basically an unconditional WARN_ON on all the probe paths,
and nothing printed? That is even more surprising.
Those two together suggest that arm_smmu_domain_alloc_paging() isn't
even being called. But the caller is:
if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
return ops->identity_domain;
else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
return ops->blocked_domain;
else if (type & __IOMMU_DOMAIN_PAGING && ops->domain_alloc_paging)
domain = ops->domain_alloc_paging(dev);
else if (ops->domain_alloc)
domain = ops->domain_alloc(alloc_type);
else
return ERR_PTR(-EOPNOTSUPP);
Which, suggest that alloc_type is somehow garbage for your system? I
don't see how that can happen, but this patch will also cause a
garbage type to be rejected by the core code.
Does this reveal anything for you?
@@ -2118,6 +2118,7 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
struct iommu_domain *domain;
unsigned int alloc_type = type & IOMMU_DOMAIN_ALLOC_FLAGS;
+ WARN(true, " __iommu_domain_alloc %u %u", alloc_type, type);
if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
return ops->identity_domain;
else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
@@ -2126,8 +2127,10 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
domain = ops->domain_alloc_paging(dev);
else if (ops->domain_alloc)
domain = ops->domain_alloc(alloc_type);
- else
+ else {
+ printk("Returning failure from __iommu_domain_alloc()\n");
return ERR_PTR(-EOPNOTSUPP);
+ }
/*
* Many domain_alloc ops now return ERR_PTR, make things easier for the
It must print, something is wrong with the debugging process if this
doesn't generate back traces :\
Jason
More information about the linux-arm-kernel
mailing list