[PATCH v4 06/16] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev

Jason Gunthorpe jgg at nvidia.com
Thu Feb 1 05:24:43 PST 2024


On Thu, Feb 01, 2024 at 12:15:53PM +0000, Mostafa Saleh wrote:
> Hi Jason,
> 
> On Thu, Jan 25, 2024 at 07:57:16PM -0400, Jason Gunthorpe wrote:
> > The BTM support wants to be able to change the ASID of any smmu_domain.
> > When it goes to do this it holds the arm_smmu_asid_lock and iterates over
> > the target domain's devices list.
> > 
> > During attach of a S1 domain we must ensure that the devices list and
> > CD are in sync, otherwise we could miss CD updates or a parallel CD update
> > could push an out of date CD.
> > 
> > This is pretty complicated, and almost works today because
> > arm_smmu_detach_dev() removes the master from the linked list before
> > working on the CD entries, preventing parallel update of the CD.
> > 
> > However, it does have an issue where the CD can remain programmed while the
> > domain appears to be unattached. arm_smmu_share_asid() will then not clear
> > any CD entries and install its own CD entry with the same ASID
> > concurrently. This creates a small race window where the IOMMU can see two
> > ASIDs pointing to different translations.
> 
> I don’t see the race condition.
> 
> The current flow is as follows,
> For SVA, if the asid was used by domain_x, it will do:
> 
> lock(arm_smmu_asid_lock)
> Alloc new asid and set cd->asid.
> lock(domain_x->devices_lock)
> Write new CD with the new asid
> unlock(domain_x->devices_lock)
> unlock(arm_smmu_asid_lock)
> 
> For attach_dev (domain_y), if the device was attached to domain_z
> //Detach old domain
> lock(domain_z->devices_lock)
> Remove master from old domain
> unlock(domain_z->devices_lock)

At this moment all locks are dropped and the RID's CD entry continues
to use the ASID.

The racing BTM flow now runs and will do your above:

arm_smmu_mmu_notifier_get()
 arm_smmu_alloc_shared_cd()
  arm_smmu_share_asid():
    arm_smmu_update_ctx_desc_devices() <<- Does nothing due to list_del above
    arm_smmu_tlb_inv_asid() <<-- Woops, we are invalidating an ASID that is still in a CD!
 arm_smmu_write_ctx_desc() <<-- Install a new translation on a PASID's CD

Now the HW can observe two installed CDs using the same ASID but they
point to different translations. This is illegal.

> Clear CD

Now we remove the RID CD, but it is too late, the PASID CD is already
installed.

ASID/VMID lifecycle must be strictly contained to ensure the cache
remains coherent:

1. All programmed STE/CDs using the ASID/VMID must always point to the
   same translation

2. All references to an ASID/VMID must be removed from their STE/CDs
   before the ASID/VMID is flushed

3. The ASID/VMID must be flushed before it is assigned to a STE/CD
   with a new translation.
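
As a rough illustration of rules 1 and 2, here is a toy userspace model
in C. It is not driver code: struct toy_cd, toy_cds_consistent() and
toy_asid_safe_to_flush() are hypothetical names made up for this sketch,
and the ttb field simply stands in for "which translation the CD points
at".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one context descriptor; NOT the kernel's CD layout. */
struct toy_cd {
	bool valid;	/* CD is programmed and visible to the HW */
	uint16_t asid;	/* ASID tagging cached translations for this CD */
	uint64_t ttb;	/* stand-in for the translation this CD points at */
};

/* Rule 1: all valid CDs using the same ASID must share one translation. */
static bool toy_cds_consistent(const struct toy_cd *cds, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (cds[i].valid && cds[j].valid &&
			    cds[i].asid == cds[j].asid &&
			    cds[i].ttb != cds[j].ttb)
				return false;
		}
	}
	return true;
}

/* Rule 2: an ASID may be flushed only when no valid CD still uses it. */
static bool toy_asid_safe_to_flush(const struct toy_cd *cds, size_t n,
				   uint16_t asid)
{
	for (size_t i = 0; i < n; i++) {
		if (cds[i].valid && cds[i].asid == asid)
			return false;
	}
	return true;
}
```

In this model the race above is the state where the RID CD and the
PASID CD are both valid with the same asid but different ttb values:
toy_cds_consistent() returns false, and toy_asid_safe_to_flush() would
also have returned false at the point the invalidation was issued.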

We solve this by requiring that the arm_smmu_asid_lock must be held
such that the smmu_domain->devices list AND the actual content of the
CD tables are always observed to be consistent.
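
A minimal sketch of that locking idea, again as a userspace toy and not
the actual patch: toy_asid_lock, struct toy_master, toy_detach() and
toy_observe_in_sync() are hypothetical names, with a pthread mutex
standing in for arm_smmu_asid_lock. The point is only that list removal
and CD clearing happen inside one critical section, so anything that
takes the same lock sees both or neither.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for arm_smmu_asid_lock in this toy model. */
static pthread_mutex_t toy_asid_lock = PTHREAD_MUTEX_INITIALIZER;

struct toy_master {
	bool on_devices_list;	/* membership of the domain's devices list */
	bool cd_programmed;	/* the RID's CD entry is still live */
};

/* Detach: both updates are done without dropping the lock in between. */
static void toy_detach(struct toy_master *m)
{
	pthread_mutex_lock(&toy_asid_lock);
	m->on_devices_list = false;	/* list_del() equivalent */
	m->cd_programmed = false;	/* clear CD equivalent */
	pthread_mutex_unlock(&toy_asid_lock);
}

/* What an ASID-changing path observes while holding the same lock. */
static bool toy_observe_in_sync(struct toy_master *m)
{
	bool in_sync;

	pthread_mutex_lock(&toy_asid_lock);
	in_sync = (m->on_devices_list == m->cd_programmed);
	pthread_mutex_unlock(&toy_asid_lock);
	return in_sync;
}
```

If toy_detach() instead dropped the lock between its two assignments,
a concurrent toy_observe_in_sync() could run in that gap and see the
master off the list while its CD is still programmed, which is the
window described earlier.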

Jason


