[PATCH v5 00/17] Update SMMUv3 to the modern iommu API (part 1/3)

Nicolin Chen nicolinc at nvidia.com
Tue Feb 6 21:27:22 PST 2024


On Tue, Feb 06, 2024 at 11:12:37AM -0400, Jason Gunthorpe wrote:
> The SMMUv3 driver was originally written in 2015 when the iommu driver
> facing API looked quite different. The API has evolved, especially lately,
> and the driver has fallen behind.
> 
> This work aims to make the SMMUv3 driver the best IOMMU driver with
> the most comprehensive implementation of the API. After all three
> parts it addresses:
> 
>  - Global static BLOCKED and IDENTITY domains with 'never fail' attach
>    semantics. BLOCKED is desired for efficient VFIO (see the sketch
>    after this list).
> 
>  - Support map before attach for PAGING iommu_domains.
> 
>  - attach_dev failure does not change the HW configuration.
> 
>  - Fully hitless transitions between IDENTITY -> DMA -> IDENTITY.
>    The API has IOMMU_RESV_DIRECT which is expected to be
>    continuously translating.
> 
>  - Safe transitions between PAGING -> BLOCKED, do not ever temporarily
>    do IDENTITY. This is required for iommufd security.
> 
>  - Full PASID API support including:
>     - S1/SVA domains attached to PASIDs
>     - IDENTITY/BLOCKED/S1 attached to RID
>     - Change of the RID domain while PASIDs are attached
> 
>  - Streamlined SVA support using the core infrastructure
> 
>  - Hitless, whenever possible, change between two domains
> 
>  - iommufd IOMMU_GET_HW_INFO, IOMMU_HWPT_ALLOC_NEST_PARENT, and
>    IOMMU_DOMAIN_NESTED support
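> 
> As an illustration of the first point, the global static domains end
> up roughly shaped like the below. This is a hedged sketch: the helper
> names are taken from the series, but the bodies are simplified here:
> 
>   static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
>                                          struct device *dev)
>   {
>           struct arm_smmu_master *master = dev_iommu_priv_get(dev);
>           struct arm_smmu_ste ste;
> 
>           /* An abort STE blocks all DMA; nothing in here can fail */
>           arm_smmu_make_abort_ste(&ste);
>           arm_smmu_install_ste_for_dev(master, &ste);
>           return 0;
>   }
> 
>   static const struct iommu_domain_ops arm_smmu_blocked_ops = {
>           .attach_dev = arm_smmu_attach_dev_blocked,
>   };
> 
>   static struct iommu_domain arm_smmu_blocked_domain = {
>           .type = IOMMU_DOMAIN_BLOCKED,
>           .ops  = &arm_smmu_blocked_ops,
>   };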
> 
> Overall, these things are going to become more accessible to iommufd, and
> exposed to VMs, so it is important for the driver to have a robust
> implementation of the API.
> 
> The work is split into three parts, with this part largely focusing on the
> STE and building up to the BLOCKED & IDENTITY global static domains.
> 
> The second part largely focuses on the CD and builds up to having a common
> PASID infrastructure that SVA and S1 domains equally use.
> 
> The third part has some random cleanups and the iommufd related parts.
> 
> Overall this takes the approach of turning the STE/CD programming upside
> down where the CD/STE value is computed right at a driver callback
> function and then pushed down into programming logic. The programming
> logic hides the details of the required CD/STE tear-less update. This
> makes the CD/STE functions independent of the arm_smmu_domain which makes
> it fairly straightforward to untangle all the different call chains, and
> add new ones.
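> 
> A minimal sketch of that flow (function names follow the series, the
> body is simplified and illustrative only):
> 
>   static int arm_smmu_attach_dev(struct iommu_domain *domain,
>                                  struct device *dev)
>   {
>           struct arm_smmu_master *master = dev_iommu_priv_get(dev);
>           struct arm_smmu_ste target;
> 
>           /* Compute the whole STE value right at the callback... */
>           arm_smmu_make_cdtable_ste(&target, master);
> 
>           /* ...then push it down. The writer logic hides the
>            * tear-less qword-by-qword update and the CMD_SYNCs. */
>           arm_smmu_install_ste_for_dev(master, &target);
>           return 0;
>   }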
> 
> Further, this frees the arm_smmu_domain related logic from keeping track
> of what state the STE/CD is currently in so it can carefully sequence the
> correct update. There are many new update pairs that are subtly introduced
> as the work progresses.
> 
> The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
> now and patches throughout this work adjust and tighten this so that it is
> clearer and doesn't get broken.
> 
> Once the lower STE layers no longer need to touch arm_smmu_domain we can
> isolate struct arm_smmu_domain to be only used for PAGING domains, audit
> all the to_smmu_domain() calls to be only in PAGING domain ops, and
> introduce the normal global static BLOCKED/IDENTITY domains using the new
> STE infrastructure. Part 2 will ultimately migrate SVA over to use
> arm_smmu_domain as well.
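> 
> Concretely, only PAGING domains embed a struct iommu_domain inside a
> struct arm_smmu_domain, so the container_of() helper is only legal
> from PAGING domain ops. A trimmed sketch of the shape:
> 
>   struct arm_smmu_domain {
>           struct arm_smmu_device  *smmu;
>           struct io_pgtable_ops   *pgtbl_ops;
>           /* ... other PAGING-only state ... */
>           struct iommu_domain     domain;
>   };
> 
>   /* Must only be called from PAGING domain ops after the audit */
>   static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
>   {
>           return container_of(dom, struct arm_smmu_domain, domain);
>   }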
> 
> All parts are on github:
> 
>  https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
> 
> v5:
>  - Rebase on v6.8-rc3
>  - Remove the writer argument to arm_smmu_entry_writer_ops get_used()
>    (see the sketch after this list)
>  - Swap order of hweight tests so one call to hweight8() can be removed
>  - Add STRTAB_STE_2_S2VMID to the used bits for
>    STRTAB_STE_0_CFG_S1_TRANS; for S2 bypass the VMID is still used by
>    the HW, but is 0
>  - Be more exact when generating STEs and store 0's to document that
>    the HW is using that value and that 0 is a deliberate choice for
>    VMID and SHCFG
>  - Remove cd_table argument to arm_smmu_make_cdtable_ste()
>  - Put arm_smmu_rmr_install_bypass_ste() after setting up a 2 level table
>  - Pull patch "Check that the RID domain is S1 in SVA" from part 2 to
>    guard against memory corruption on failure paths
>  - Tighten the used logic for SHCFG to accommodate nesting patches in
>    part 3
>  - Additional comments and commit message adjustments
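> 
> For reference, get_used() is the hook that reports which bits of an
> entry the HW actually inspects for its current configuration, so the
> writer can sequence a tear-less update. A heavily trimmed,
> illustrative sketch of the STE flavour:
> 
>   static void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
>   {
>           used_bits[0] = cpu_to_le64(STRTAB_STE_0_V);
>           if (!(ent[0] & cpu_to_le64(STRTAB_STE_0_V)))
>                   return;
>           used_bits[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
> 
>           if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent[0])) ==
>               STRTAB_STE_0_CFG_S1_TRANS) {
>                   used_bits[0] |= cpu_to_le64(STRTAB_STE_0_S1CTXPTR);
>                   /* The HW reads S2VMID even for S1; 0 is deliberate */
>                   used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID);
>           }
>   }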

I have retested this v5 on its own with SVA cases and system sanity
checks.

I also did similar tests with part-2 in the "smmuv3_newapi" branch,
plus adding the "iommu.passthrough=y" kernel parameter to cover the
S1DSS.BYPASS use case.

After that, I retested the entire branch including part-3 with a
nested-smmu VM, to cover different STE configurations.

All results look good.

Tested-by: Nicolin Chen <nicolinc at nvidia.com>


