[PATCH v5 00/27] Update SMMUv3 to the modern iommu API (part 2/3)
Mostafa Saleh
smostafa at google.com
Mon Mar 25 03:22:15 PDT 2024
Hi Jason,
On Mon, Mar 04, 2024 at 07:43:48PM -0400, Jason Gunthorpe wrote:
> Continuing the work of part 1 this focuses on the CD, PASID and SVA
> components:
>
> - attach_dev failure does not change the HW configuration.
>
> - Full PASID API support including:
> - S1/SVA domains attached to PASIDs
> - IDENTITY/BLOCKED/S1 attached to RID
> - Change of the RID domain while PASIDs are attached
>
> - Streamlined SVA support using the core infrastructure
>
> - Hitless, whenever possible, change between two domains
>
> Making the CD programming work like the new STE programming allows
> untangling some of the confusing SVA flows. From there the focus is on
> building out the core infrastructure for dealing with PASID and CD
> entries, then keeping track of unique SSID's for ATS invalidation.
>
> The ATS ordering is generalized so that the PASID flow can use it and put
> into a form where it is fully hitless, whenever possible. Care is taken to
> ensure that ATC flushes are present after any change in translation.
>
> Finally we simply kill the entire outdated SVA mmu_notifier implementation
> in one shot and switch it over to the newly created generic PASID & CD
> code. This avoids the messy and confusing approach of trying to
> incrementally untangle this in place. The new code is small and simple
> enough this is much better than trying to figure out smaller steps.
>
> Once SVA is resting on the right CD code it is straightforward to make the
> PASID interface functionally complete.
>
> It achieves the same goals as the several series from Michael and the S1DSS
> series from Nicolin that were trying to improve portions of the API.
>
> This is on github:
> https://github.com/jgunthorpe/linux/commits/smmuv3_newapi
Testing on qemu[1], with the same VMM Shameer tested with[2]:
qemu/build/qemu-system-aarch64 -M virt -machine virt,gic-version=3,iommu=nested-smmuv3,iommufd=iommufd0 \
-cpu cortex-a53,pmu=off -smp 1 -m 2048 \
-kernel Image \
-drive file=rootfs.ext4,if=virtio,format=raw \
-object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 -nographic \
-append 'console=ttyAMA0 rootwait root=/dev/vda' \
-device virtio-scsi-pci,id=scsi0 \
-device ioh3420,id=pcie.1,chassis=1 \
-object iommufd,id=iommufd0 \
-device vfio-pci,host=0000:00:03.0,iommufd=iommufd0
I see the following panic:
[ 155.141233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 155.142416] Mem abort info:
[ 155.142722] ESR = 0x0000000086000004
[ 155.143106] EC = 0x21: IABT (current EL), IL = 32 bits
[ 155.143827] SET = 0, FnV = 0
[ 155.144266] EA = 0, S1PTW = 0
[ 155.144721] FSC = 0x04: level 0 translation fault
[ 155.145432] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101059000
[ 155.146234] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 155.148162] Internal error: Oops: 0000000086000004 [#1] PREEMPT SMP
[ 155.149399] Modules linked in:
[ 155.150366] CPU: 2 PID: 371 Comm: qemu-system-aar Not tainted 6.8.0-rc7-gde77230ac23a #9
[ 155.151728] Hardware name: linux,dummy-virt (DT)
[ 155.152770] pstate: 81400809 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=-c)
[ 155.153895] pc : 0x0
[ 155.154889] lr : iommufd_hwpt_invalidate+0xa4/0x204
[ 155.156272] sp : ffff800080f3bcc0
[ 155.156971] x29: ffff800080f3bcf0 x28: ffff0000c369b300 x27: 0000000000000000
[ 155.158135] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 155.159175] x23: 0000000000000000 x22: 00000000c1e334a0 x21: ffff0000c1e334a0
[ 155.160343] x20: ffff800080f3bd38 x19: ffff800080f3bd58 x18: 0000000000000000
[ 155.161298] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8240d6d8
[ 155.162355] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 155.163463] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
[ 155.164947] x8 : 0000001000000002 x7 : 0000fffeac1ec950 x6 : 0000000000000000
[ 155.166057] x5 : ffff800080f3bd78 x4 : 0000000000000003 x3 : 0000000000000002
[ 155.167343] x2 : 0000000000000000 x1 : ffff800080f3bcc8 x0 : ffff0000c6034d80
[ 155.168851] Call trace:
[ 155.169738] 0x0
[ 155.170623] iommufd_fops_ioctl+0x154/0x274
[ 155.171555] __arm64_sys_ioctl+0xac/0xf0
[ 155.172095] invoke_syscall+0x48/0x110
[ 155.172633] el0_svc_common.constprop.0+0x40/0xe0
[ 155.173277] do_el0_svc+0x1c/0x28
[ 155.173847] el0_svc+0x34/0xb4
[ 155.174312] el0t_64_sync_handler+0x120/0x12c
[ 155.174969] el0t_64_sync+0x190/0x194
[ 155.176006] Code: ???????? ???????? ???????? ???????? (????????)
[ 155.178349] ---[ end trace 0000000000000000 ]---
The core IOMMUFD code calls domain->ops->cache_invalidate_user
unconditionally from IOCTL:IOMMU_HWPT_INVALIDATE and the SMMUv3 driver
doesn't implement it, that seems missing as otherwise the VMM can't
invalidate S1 mappings, or I a missing something?
[1] https://lore.kernel.org/all/20240325101442.1306300-1-smostafa@google.com/
[2] https://github.com/nicolinc/qemu/commits/wip/iommufd_vsmmu-02292024/
>
> v5:
> - Rebase on v6.8-rc7 & Will's tree
> - Accomdate the SVA rc patch removing the master list iteration
> - Move the kfree(to_smmu_domain(domain)) hunk to the right patch
> - Move S1DSS get_used hunk to "Allow IDENTITY/BLOCKED to be set while
> PASID is used"
> v4: https://lore.kernel.org/r/0-v4-e7091cdd9e8d+43b1-smmuv3_newapi_p2_jgg@nvidia.com
> - Rebase on v6.8-rc1, adjust to use mm_get_enqcmd_pasid() and eventually
> remove all references from ARM. Move the new ARM_SMMU_FEAT_STALL_FORCE
> stuff to arm_smmu_make_sva_cd()
> - Adjust to use the new shared STE/CD writer logic. Disable some of the
> sanity checks for the interior of the series
> - Return ERR_PTR from domain_alloc functions
> - Move the ATS disablement flow into arm_smmu_attach_prepare()/commit()
> which lets all the STE update flows use the same sequence. This is
> needed for nesting in part 3
> - Put ssid in attach_state
> - Replace to_smmu_domain_safe() with to_smmu_domain_devices()
> v3: https://lore.kernel.org/r/0-v3-9083a9368a5c+23fb-smmuv3_newapi_p2_jgg@nvidia.com
> - Rebase on the latest part 1
> - update comments and commit messages
> - Fix error exit in arm_smmu_set_pasid()
> - Fix inverted logic for btm_invalidation
> - Add missing ATC invalidation on mm release
> - Add a big comment explaining that BTM is not enabled and what is
> missing to enable it.
> v2: https://lore.kernel.org/r/0-v2-16665a652079+5947-smmuv3_newapi_p2_jgg@nvidia.com
> - Rebased on iommmufd + Joerg's tree
> - Use sid_smmu_domain consistently to refer to the domain attached to the
> device (eg the PCIe RID)
> - Rework how arm_smmu_attach_*() and callers flow to be more careful
> about ordering around ATC invalidation. The ATC must be invalidated
> after it is impossible to establish stale entires.
> - ATS disable is now entirely part of arm_smmu_attach_dev_ste(), which is
> the only STE type that ever disables ATS.
> - Remove the 'existing_master_domain' optimization, the code is
> functionally fine without it.
> - Whitespace, spelling, and checkpatch related items
> - Fixed wrong value stored in the xa for the BTM flows
> - Use pasid more consistently instead of id
> v1: https://lore.kernel.org/r/0-v1-afbb86647bbd+5-smmuv3_newapi_p2_jgg@nvidia.com
>
> Jason Gunthorpe (27):
> iommu/arm-smmu-v3: Do not allow a SVA domain to be set on the wrong
> PASID
> iommu/arm-smmu-v3: Do not ATC invalidate the entire domain
> iommu/arm-smmu-v3: Add a type for the CD entry
> iommu/arm-smmu-v3: Add an ops indirection to the STE code
> iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
> iommu/arm-smmu-v3: Consolidate clearing a CD table entry
> iommu/arm-smmu-v3: Move the CD generation for S1 domains into a
> function
> iommu/arm-smmu-v3: Move allocation of the cdtable into
> arm_smmu_get_cd_ptr()
> iommu/arm-smmu-v3: Allocate the CD table entry in advance
> iommu/arm-smmu-v3: Move the CD generation for SVA into a function
> iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
> iommu/arm-smmu-v3: Start building a generic PASID layer
> iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list
> iommu/arm-smmu-v3: Make changing domains be hitless for ATS
> iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain
> iommu/arm-smmu-v3: Keep track of valid CD entries in the cd_table
> iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*()
> interface
> iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain
> iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA
> iommu: Add ops->domain_alloc_sva()
> iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain
> iommu/arm-smmu-v3: Consolidate freeing the ASID/VMID
> iommu/arm-smmu-v3: Move the arm_smmu_asid_xa to per-smmu like vmid
> iommu/arm-smmu-v3: Bring back SVA BTM support
> iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is
> used
> iommu/arm-smmu-v3: Allow a PASID to be set when RID is
> IDENTITY/BLOCKED
> iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID
>
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 639 +++++-----
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1036 +++++++++++------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 79 +-
> drivers/iommu/iommu-sva.c | 4 +-
> drivers/iommu/iommu.c | 12 +-
> include/linux/iommu.h | 3 +
> 6 files changed, 1024 insertions(+), 749 deletions(-)
>
>
> base-commit: 98b23ebb0c84657a135957d727eedebd1280cbbf
> --
> 2.43.2
>
More information about the linux-arm-kernel
mailing list