[PATCH v5 14/27] iommu/arm-smmu-v3: Make changing domains be hitless for ATS

Michael Shavit mshavit at google.com
Thu Mar 21 07:57:29 PDT 2024


On Thu, Mar 21, 2024 at 10:53 PM Michael Shavit <mshavit at google.com> wrote:
>
> On Thu, Mar 21, 2024 at 9:28 PM Jason Gunthorpe <jgg at nvidia.com> wrote:
> >
> > On Thu, Mar 21, 2024 at 08:26:43PM +0800, Michael Shavit wrote:
> > > Overall I think the patch works, but it took me a while to really
> > > digest the big picture. Just to make sure I fully understand:
> > >
> > > We're trying to satisfy the following invariants for correctness:
> > > 1. Devices cannot get translations from a domain that was detached
> > > after arm_smmu_attach_dev() returns.
> >
> > Yes, regardless of ATS this is the API requirement.
> >
> > > 2. Devices cannot get cached translations from a domain after
> > > arm_smmu_atc_inv_domain() returns, regardless of whether the domain is
> > > simultaneously being attached/detached from a device.
> >
> > There is no simultaneously here since the group lock is held by the
> > core code.
>
> I meant a call to arm_smmu_attach_dev concurrent with
> arm_smmu_atc_inv_domain (through tlb_flush_all).
>
> >
> > > Apart from note point 2. above, the behaviour of translations while in
> > > the middle of an arm_smmu_attach_dev doesn't have well defined
> > > requirements:
> > > 1. Before this patch, devices may get translations belonging to the
> > > old domain, then aborts or identity translations, and then the new
> > > domain while arm_smmu_attach_dev() is in progress.
> >
> > Yes
> >
> > > 2. Ater this patch, devices may get an arbitrary mix of translations
> > > belonging to the old domain and the new domain while
> > > arm_smmu_attach_dev() is in progress.
> >
> > Yes, but no aborts.
> >
> > > While disabling and re-enabling ATS inside arm_smmu_attach_dev() would
> > > meet the same requirements as this patch, it's not optimal since the
> > > device may still have traffic on other pasids than the one being
> > > manipulated.
> >
> > Yes, we can't just disable ATS due to the PASIDs. A SVA PASID might be
> > present and disabling ATS to change the RID domain would completely
> > wreck it. Today this scenario is prevented by sva_enable, which is
> > removed in following patches.
>
> What do you mean by wrecking? We might slow it down to a crawl but we
> wouldn't be corrupting or destroying anything by disabling ATS while
> SVA is running would we?
> Oh does disabling ATS abort all transactions rather than making it transparent?
>
> >
> > I added this note for context to the commit message:
> >
> >  This is part of the bigger picture to allow changing the RID domain while
> >  a PASID is in use. If a SVA PASID is relying on ATS to function then
> >  changing the RID domain cannot just temporarily toggle ATS off without
> >  also wrecking the SVA PASID. The new infrastructure here is organized so
> >  that the PASID attach/detach flows will make use of it as well in
> >  following patches.

Huh ok yeah, I thought this was just enabling/disabling caching of
translations in the ATC but looks like it's disabling translation for
those PCIe devices.



More information about the linux-arm-kernel mailing list