[PATCH v1 14/14] iommu/arm-smmu-v3: Add arm_smmu_cache_invalidate_user

Jason Gunthorpe jgg at nvidia.com
Fri Mar 24 07:44:40 PDT 2023


On Fri, Mar 24, 2023 at 08:47:20AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg at nvidia.com>
> > Sent: Tuesday, March 21, 2023 7:49 PM
> > 
> > On Tue, Mar 21, 2023 at 08:34:00AM +0000, Tian, Kevin wrote:
> > 
> > > > > Rephrasing that to put it into a design: the IOCTL would pass a
> > > > > user pointer to the queue, the size of the queue, then a head
> > > > > pointer and a tail pointer? Then the kernel reads out all the
> > > > > commands between the head and the tail and handles only those
> > > > > invalidation commands?
> > > >
> > > > Yes, that is one possible design
> > >
> > > If we cannot have the short path in the kernel then I'm not sure of
> > > the value of using the native format and queue in the uAPI. Batching
> > > can be enabled over any format.
> > 
> > SMMUv3 will have a hardware short path where the HW itself runs the
> > VM's command queue and does this logic.
> > 
> > So I like the symmetry of the SW path being close to that.
> > 
> 
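
To sketch the shape of that uAPI (every name and field below is made
up, nothing committed): userspace hands the kernel its vSMMU command
queue plus the head/tail indexes of the region with new commands, and
the kernel consumes only the invalidation commands in that window.

#include <linux/types.h>
#include <linux/uaccess.h>

struct iommu_vsmmu_invalidate {
	__aligned_u64 q_uptr;	/* user pointer to the guest CMDQ */
	__u32 q_log2size;	/* log2 of the queue entry count */
	__u32 head;		/* index of the first new command */
	__u32 tail;		/* index one past the last new command */
	__u32 __reserved;
};

/* Kernel side, roughly: walk head..tail of the guest queue */
static int vsmmu_invalidate(struct iommu_vsmmu_invalidate *a)
{
	u64 __user *uq = u64_to_user_ptr(a->q_uptr);
	u64 cmd[2];		/* SMMUv3 CMDQ entries are 128 bits */
	u32 mask = (1U << a->q_log2size) - 1;
	u32 i;

	for (i = a->head; i != a->tail; i = (i + 1) & mask) {
		if (copy_from_user(cmd, uq + 2 * i, sizeof(cmd)))
			return -EFAULT;
		/* decode cmd[0]; execute CMD_TLBI_* etc, reject
		 * anything that is not an invalidation */
	}
	return 0;
}
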
> Out of curiosity: VCMDQ is per SMMU. Does that imply that QEMU needs
> to create multiple vSMMU instances if the devices assigned to it are
> behind different physical SMMUs (plus one instance specifically for
> emulated devices), to match a VCMDQ to a specific device?

Yes

> btw is VCMDQ in the standard SMMU spec or an NVIDIA-specific extension?
> If the latter, does it require extra changes in the guest SMMU driver?

It is a mash-up of the standard ARM ECMDQ with a few additions. I hope
ARM will standardize something someday.

> The symmetry of the SW path has another merit beyond performance.
> It allows live migration to fall back to the SW short-path with
> not-so-bad overhead when the dest machine cannot afford the same
> number of VCMDQs as the src.

Well, that requires SW emulation of the VCMDQ thing, but yes.
 
> But still the main open question for the in-kernel short-path is what
> the framework would be to move part of the vIOMMU emulation into the
> kernel. If this can be done cleanly then it's better than vhost-iommu,
> which lags behind significantly on advanced features. But if it cannot
> be done cleanly, leaving each vendor to move random emulation logic
> into the kernel, then vhost-iommu sounds more friendly to the kernel,
> though lots of work remains to fill the feature gap.

I assume there are reasonable ways to hook the kernel to KVM; vhost
does it. I've never looked at it. At worst we need to factor some of
the vhost code into a library to allow it.

We want a kernel thread to wake up on a doorbell ring, basically.
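
Much like how KVM's irqfd and vhost hook an eventfd today: register a
wakeup callback on the eventfd's wait queue via vfs_poll() and kick a
worker from the callback. A rough sketch below; the viommu_* names are
made up, vhost uses its own worker thread where this uses the system
workqueue, and teardown is omitted.

#include <linux/eventfd.h>
#include <linux/file.h>
#include <linux/poll.h>
#include <linux/wait.h>
#include <linux/workqueue.h>

struct viommu_doorbell {
	struct eventfd_ctx *eventfd;
	wait_queue_entry_t wait;
	poll_table pt;
	struct work_struct work;	/* drains the guest command queue */
};

static void viommu_drain_queue(struct work_struct *work)
{
	/* Walk head..tail of the guest queue and execute the
	 * invalidation commands (not shown). */
}

/* Runs in wakeup context when userspace/KVM signals the eventfd */
static int viommu_doorbell_wakeup(wait_queue_entry_t *wait, unsigned mode,
				  int sync, void *key)
{
	struct viommu_doorbell *db =
		container_of(wait, struct viommu_doorbell, wait);

	if (key && !(key_to_poll(key) & EPOLLIN))
		return 0;
	schedule_work(&db->work);	/* doorbell rang, kick the worker */
	return 0;
}

static void viommu_doorbell_ptable_queue(struct file *file,
					 wait_queue_head_t *wqh,
					 poll_table *pt)
{
	struct viommu_doorbell *db =
		container_of(pt, struct viommu_doorbell, pt);

	add_wait_queue(wqh, &db->wait);
}

static int viommu_doorbell_attach(struct viommu_doorbell *db, int fd)
{
	struct fd f = fdget(fd);

	if (!f.file)
		return -EBADF;
	db->eventfd = eventfd_ctx_fileget(f.file);
	if (IS_ERR(db->eventfd)) {
		fdput(f);
		return PTR_ERR(db->eventfd);
	}
	INIT_WORK(&db->work, viommu_drain_queue);
	init_waitqueue_func_entry(&db->wait, viommu_doorbell_wakeup);
	init_poll_funcptr(&db->pt, viommu_doorbell_ptable_queue);
	vfs_poll(f.file, &db->pt);	/* hook our wakeup onto the eventfd */
	fdput(f);
	return 0;
}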

Jason


