[PATCH rc v2] iommu/arm-smmu-v3: Do not use GFP_KERNEL under a spinlock

Michael Shavit mshavit at google.com
Sat Feb 17 04:25:36 PST 2024


On Fri, Feb 16, 2024 at 8:36 PM Jason Gunthorpe <jgg at nvidia.com> wrote:
>
> On Fri, Feb 16, 2024 at 12:05:12PM +0000, Will Deacon wrote:
> > On Thu, Feb 15, 2024 at 10:56:57AM -0400, Jason Gunthorpe wrote:
> > > If the SMMU is configured to use a two level CD table then
> > > arm_smmu_write_ctx_desc() allocates a CD table leaf internally using
> > > GFP_KERNEL. Due to recent changes this is being done under a spinlock to
> > > iterate over the device list - thus it will trigger a sleeping while
> > > atomic warning:
> > >
> > >   arm_smmu_sva_set_dev_pasid()
> > >     mutex_lock(&sva_lock);
> > >     __arm_smmu_sva_bind()
> > >       arm_smmu_mmu_notifier_get()
> > >         spin_lock_irqsave()
> > >         arm_smmu_write_ctx_desc()
> > >           arm_smmu_get_cd_ptr()
> > >             arm_smmu_alloc_cd_leaf_table()
> > >               dmam_alloc_coherent(GFP_KERNEL)
> > >
> > > This is a 64K high order allocation and really should not be done
> > > atomically.
> > >
> > > At the moment the rework of the SVA to follow the new API is half
> > > finished. Recently the CD table memory was moved from the domain to the
> > > master, however we have the confusing situation where the SVA code is
> > > wrongly using the RID domain's device list to track which CD tables the
> > > SVA is installed in.
> > >
> > > Remove the logic to replicate the CD across all the domain's masters
> > > during attach. We know which master and which CD table the PASID should be
> > > installed in.
> > >
> > > At the moment SVA is only invoked when dma-iommu.c is in control of the
> > > RID translation, which means we have a single iommu_domain shared across
> > > the entire group and that iommu_domain is not shared outside the group.
> > >
> > > For PCI cases the core code also insists on singleton groups so there is
> > > only ever one entry in the smmu_domain->domains list that is equal to the
> > > master being passed in to arm_smmu_sva_set_dev_pasid().
> > >
> > > Only non-PCI cases may have multi-device groups. However, the core code
> > > itself will replicate the calls to arm_smmu_sva_set_dev_pasid() across the
> > > entire group so we will still correctly install the CD into each group
> > > member's master.
> >
> > Are you sure about this paragraph? arm_smmu_mmu_notifier_get() will return
> > early if it finds an existing notifier in the 'mmu_notifiers' list for the
> > domain, so I don't think we'll actually get as far as installing the CD,
> > will we?
>
> I think the paragraph is the right analysis, the code just isn't
> listening very well..
>
> Lifting up the arm_smmu_write_ctx_desc() into the caller will fix it.

Calling arm_smmu_write_ctx_desc() requires the CD, which we get from the
mmu_notifiers list, so it is a bit more complicated than that.
But it does sound doable with some work (perhaps keeping the call in
arm_smmu_mmu_notifier_get() for now and changing the early return
logic?).
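
To make the "changing the early return logic" idea concrete, here is a
rough user-space model. It is not driver code: every name in it is
hypothetical and the real structures are more involved. The only point is
that finding an existing notifier no longer skips installing the CD for
the master being bound.

```c
/*
 * User-space model only. All names are hypothetical and exist purely to
 * illustrate the control flow; none of this is the arm-smmu-v3 API.
 */
#include <assert.h>
#include <stddef.h>

#define MAX_NOTIFIERS 4

struct model_cd {
    int asid;                       /* placeholder payload */
};

struct model_notifier {
    const void *mm;                 /* lookup key: the address space */
    struct model_cd cd;             /* CD shared by all users of this entry */
    int refs;
};

struct model_master {
    const struct model_cd *installed_cd;  /* stands in for a CD table slot */
};

struct model_domain {
    struct model_notifier notifiers[MAX_NOTIFIERS];
    size_t nr;
};

/*
 * Look up (or create) the notifier for mm. Unlike an early return on a
 * list hit, the CD is installed for master m on both paths.
 */
static struct model_notifier *model_notifier_get(struct model_domain *d,
                                                 struct model_master *m,
                                                 const void *mm)
{
    struct model_notifier *n = NULL;

    assert(d != NULL && m != NULL);

    for (size_t i = 0; i < d->nr; i++) {
        if (d->notifiers[i].mm == mm) {
            n = &d->notifiers[i];
            n->refs++;              /* reuse, but do not return yet */
            break;
        }
    }

    if (!n) {
        if (d->nr == MAX_NOTIFIERS)
            return NULL;            /* model-only capacity limit */
        n = &d->notifiers[d->nr++];
        n->mm = mm;
        n->cd.asid = (int)d->nr;    /* arbitrary value for the model */
        n->refs = 1;
    }

    /* Common tail: the master passed in always gets the CD. */
    m->installed_cd = &n->cd;
    return n;
}
```

In this model a second master bound to the same mm reuses the notifier
and still ends up with the CD installed, which is the behaviour the
multi-device group case would need.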

>
> Also Michael should look at it (I recall we talked about this once)
> and Nicolin should test it.


Just to make sure I follow why we're pursuing this instead of the v1
rc patch: in the non-PCI multi-device group scenario, the first call to
set_dev_pasid would only have pre-allocated for the current master, but
arm_smmu_mmu_notifier_get() would then still call
arm_smmu_write_ctx_desc() for the other masters?

>
> BTW, I have no idea if non-PCI cases exists, everyone I know is doing
> PCI SVA.
>
> Jason


