[PATCH rc v2] iommu/arm-smmu-v3: Do not use GFP_KERNEL under as spinlock

Jason Gunthorpe jgg at nvidia.com
Mon Feb 19 16:35:28 PST 2024


On Mon, Feb 19, 2024 at 04:32:40PM +0800, Michael Shavit wrote:
> On Sat, Feb 17, 2024 at 9:24 PM Jason Gunthorpe <jgg at nvidia.com> wrote:
> >
> > On Sat, Feb 17, 2024 at 08:25:36PM +0800, Michael Shavit wrote:
> >
> > > Calling arm_smmu_write_ctx_desc requires the CD which we get from the
> > > mmu_notifiers list...which makes it a bit more complicated than
> > > that.
> >
> > @@ -404,9 +384,15 @@ static int __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
> >                 goto err_free_bond;
> >         }
> >
> > +       ret = arm_smmu_write_ctx_desc(master, pasid, bond->smmu_mn->cd);
> > +       if (ret)
> > +               goto err_put_notifier;
> > +
> >
> Oh hey that's not much more complicated :) . I'm guessing that'll be a
> call to arm_smmu_mmu_notifier_put() rather than a goto on the error
> path?

Yes, the error unwind is the goto, which does the put and kfree(bond).

> Speaking of... the arm_smmu_update_ctx_desc_devices call in
> arm_smmu_mmu_notifier_put might be tricky as well if it's encountered
> on the error path before all devices (in theoretical non-pci case) had
> a chance to previously call arm_smmu_write_ctx_desc.

It is hard to understand, but I think it is close enough to OK.

The put will pair with the get, and if this is the first member of the
group then it will do a symmetric tear down. The

	if (!refcount_dec_and_test(&smmu_mn->refs))
		return;

takes care of that.

Otherwise the puts accumulate the refcount back down to zero which
should be hit once __iommu_remove_group_pasid() gets all the group
members removed.

IOW the CDs are not cleaned up on any devices until all the group
members have the PASID removed. Clearly it is not a correct design, but
it looks like it works well enough even in error paths.
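
As a rough userspace model of that get/put pairing (this is not the
driver code: struct toy_notifier, toy_get() and toy_put() are made-up
names for illustration, and a plain int stands in for refcount_t):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared notifier, not the driver's struct. */
struct toy_notifier {
	int refs;	/* models smmu_mn->refs */
	bool cds_live;	/* models the CDs being installed on the masters */
};

/* Each group member that binds takes one reference. */
static void toy_get(struct toy_notifier *mn)
{
	mn->refs++;
	mn->cds_live = true;
}

/*
 * Each put drops one reference. Only the put that reaches zero wipes
 * the CDs; every earlier put returns without touching them.
 * Returns true if this call did the tear down.
 */
static bool toy_put(struct toy_notifier *mn)
{
	if (--mn->refs != 0)
		return false;

	mn->cds_live = false;	/* models wiping all CDs at once */
	return true;
}
```

With two members holding references, the first toy_put() leaves
cds_live set and only the second one clears it. With a single get, as
on an error path for the first member, the matching put tears down
immediately.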

Then when it does eventually reach arm_smmu_update_ctx_desc_devices()
it wipes all the CDs of all the RID domain's masters which is the same
as the group membership.

Which will end up happening before iommu_attach_device_pasid() returns
on its error path.

(obviously this is all made to work logically and properly in part 2)

Jason


