[RFC PATCH v2 0/7] iommu/arm-smmu-v3: Use pinned KVM VMID for stage 2

Jason Gunthorpe jgg at ziepe.ca
Fri Feb 9 04:48:29 PST 2024


On Fri, Feb 09, 2024 at 11:58:24AM +0000, Jean-Philippe Brucker wrote:

> * Stage-1 TLB entries in the SMMU have a bit (ASET) saying "this entry
>   is private and does not participate in BTM", which we set for private
>   SMMU address spaces.
> 
>   Annoyingly, the stage-2 TLB entries do not have it. With BTM all VMIDs
>   are shared between CPU and SMMU.

Right, the spec justified this decision like this:

 Note: Arm expects that SMMU stage 2 address spaces are generally
 shared with their respective PE virtual machine stage 2
 configuration. If broadcast invalidation is required to be avoided
 for a particular SMMU stage 2 address space, Arm recommends that a
 hypervisor configures the STE with a VMID that is not allocated for
 virtual machine use on the PEs.

Which doesn't match how Linux works, and I think after the recent KVM
PUCK call on this topic we can say it does not match how Linux will
work going into the future. The assumption that the KVM S2 and IOMMU
S2 are shared does not hold.

So unfortunately this creates waste, as BTM will generate a worthless
invalidation workload on the related IOMMU S2. We cannot do as the
spec suggests and avoid broadcast invalidation here with a unique
VMID, as that would break vSVA vBTM invalidation to the S1.

I do wonder how much of a performance penalty this will create. At
least the S1 isn't flushed, so perhaps the performance hit is
small.

Anyhow, I view it as a defect that the HW doesn't have a BTM ignore
bit at the S2 level so that we can use the same VMID to make vBTM work
but not participate in CPU originated invalidations for a non-shared
S2. A global bit to disable S2 BTM would have been fine for Linux.
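
To make the asymmetry concrete, here is a toy C model of the matching
rules discussed above. It is only a sketch: the struct layouts and
function names are made up for illustration and are not the SMMUv3
TLB format or any kernel API, and the btm_ignore field is purely
hypothetical (its absence in HW is the complaint).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Toy model only; field names are illustrative, not the HW layout. */
struct s1_tlb_entry {
	uint16_t vmid;
	uint16_t asid;
	bool aset;		/* private: does not participate in BTM */
};

struct s2_tlb_entry {
	uint16_t vmid;
	bool btm_ignore;	/* HYPOTHETICAL: no such S2 bit exists */
};

/* S1: a CPU broadcast invalidation skips entries marked private. */
static bool s1_hit_by_broadcast(const struct s1_tlb_entry *e,
				uint16_t vmid, uint16_t asid)
{
	assert(e != NULL);
	return !e->aset && e->vmid == vmid && e->asid == asid;
}

/* S2 as described in this thread: any matching VMID participates. */
static bool s2_hit_by_broadcast(const struct s2_tlb_entry *e, uint16_t vmid)
{
	assert(e != NULL);
	return e->vmid == vmid;
}

/* S2 with the wished-for opt-out: same VMID, but CPU broadcasts skip it. */
static bool s2_hit_by_broadcast_with_ignore(const struct s2_tlb_entry *e,
					    uint16_t vmid)
{
	assert(e != NULL);
	return !e->btm_ignore && e->vmid == vmid;
}
```

In this model a non-shared IOMMU S2 that reuses the KVM VMID (so vBTM
to the S1 keeps working) is always hit by s2_hit_by_broadcast(), while
the hypothetical variant would let it opt out.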

>   - The old VFIO_TYPE1_NESTING_IOMMU lets userspace allocate a private
>     stage-2, and has only been used for testing as far as I know. I don't
>     think I ever found a program that used it in the wild, but haven't
>     checked recently.

There isn't, it is useless and cannot do anything. A patch has been
waiting to remove it for a while now, I've got it in my part 3 right
now.

>     It needs to be deprecated over a few releases (starting with a
>     warning maybe?),

No need, we just NOP'd it. It has no user visible side effect,
replacing the S2 with a S1 is fine.

>     and the replacement API shouldn't allow creating a
>     stage-2 without a KVM context.

See my remarks yesterday.

Jason


