[PATCH RFCv1 08/14] iommufd: Add IOMMU_VIOMMU_SET_DEV_ID ioctl

Tian, Kevin kevin.tian at intel.com
Wed May 29 17:28:43 PDT 2024


> From: Nicolin Chen <nicolinc at nvidia.com>
> Sent: Wednesday, May 29, 2024 11:21 AM
> 
> On Wed, May 29, 2024 at 02:58:11AM +0000, Tian, Kevin wrote:
> > > From: Nicolin Chen <nicolinc at nvidia.com>
> > > Sent: Wednesday, May 29, 2024 4:23 AM
> 
> > > What I mean is for regular vSMMU. Without VCMDQ, a regular vSMMU
> > > on a multi-pSMMU setup will look like (e.g. three devices behind
> > > different SMMUs):
> > > |<------ VMM ------->|<------ kernel ------>|
> > >        |-- viommu0 --|-- pSMMU0 --|
> > > vSMMU--|-- viommu1 --|-- pSMMU1 --|--s2_hwpt
> > >        |-- viommu2 --|-- pSMMU2 --|
> > >
> > > And the devices would attach as:
> > > |<---- guest ---->|<--- VMM --->|<- kernel ->|
> > >        |-- dev0 --|-- viommu0 --|-- pSMMU0 --|
> > > vSMMU--|-- dev1 --|-- viommu1 --|-- pSMMU1 --|
> > >        |-- dev2 --|-- viommu2 --|-- pSMMU2 --|
> > >
> > > When trapping a device cache invalidation, it is straightforward
> > > to decipher the virtual device ID and pick the viommu that the
> > > device is attached to.
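
In rough pseudo-C, the routing above boils down to something like the
below (all names are made up purely for illustration, not actual VMM or
iommufd code):

#include <stdint.h>
#include <stddef.h>

/* one viommu object per physical SMMU, as in the diagram above */
struct viommu {
        int psmmu;                 /* index of the backing pSMMU */
};

/* virtual device (stream) ID -> viommu the device is attached to */
struct vdev {
        uint32_t vsid;
        struct viommu *viommu;
};

static struct viommu viommus[3] = { {0}, {1}, {2} };
static struct vdev vdevs[3] = {
        { 0x10, &viommus[0] },     /* dev0 behind pSMMU0 */
        { 0x11, &viommus[1] },     /* dev1 behind pSMMU1 */
        { 0x12, &viommus[2] },     /* dev2 behind pSMMU2 */
};

/* trapped device cache invalidation: decipher the vSID and pick the
 * viommu (hence the pSMMU) the command must be forwarded to */
static struct viommu *route_invalidation(uint32_t vsid)
{
        for (size_t i = 0; i < 3; i++)
                if (vdevs[i].vsid == vsid)
                        return vdevs[i].viommu;
        return NULL;
}

int main(void)
{
        /* e.g. an invalidation trapped for vSID 0x11 goes to pSMMU1 */
        struct viommu *v = route_invalidation(0x11);
        return v ? v->psmmu : -1;
}
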
> >
> > I understand how the above works.
> >
> > My question is why that option was chosen instead of going with a
> > 1:1 mapping between vSMMU and viommu, i.e. letting the kernel
> > figure out which pSMMU an invalidation cmd should be sent to, as
> > is done when virtualizing VT-d.
> >
> > I want to know whether doing so is simply to be compatible with
> > what VCMDQ requires, or due to another untold reason.
> 
> Because we use viommu as a VMID holder for SMMU. So a pSMMU must
> have its own viommu to store its VMID for a shared s2_hwpt:
>         |-- viommu0 (VMIDx) --|-- pSMMU0 --|
>  vSMMU--|-- viommu1 (VMIDy) --|-- pSMMU1 --|--s2_hwpt
>         |-- viommu2 (VMIDz) --|-- pSMMU2 --|
> 
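i.e., if I read it right, each viommu instance in your scheme conceptually
looks like the below (type-level sketch only, made-up names, not the actual
uAPI or driver structs):

#include <stdint.h>

/* shared stage-2 page table, one per VM */
struct s2_hwpt;

/* one viommu per physical SMMU: the s2_hwpt is shared across them,
 * but each VMID comes from its own pSMMU's VMID space */
struct viommu {
        struct s2_hwpt *s2;   /* same object for viommu0/1/2 */
        int psmmu;            /* pSMMU0 / pSMMU1 / pSMMU2 */
        uint16_t vmid;        /* VMIDx / VMIDy / VMIDz above */
};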

there are other options, e.g. you could have one viommu holding multiple
VMIDs, each associated with a pSMMU.
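
roughly, with made-up names just to show the shape of that alternative
(again a type-level sketch, not actual structs):

#include <stdint.h>

#define NR_PSMMU 3

struct s2_hwpt;

/* a single viommu for the whole vSMMU, carrying one VMID per physical
 * SMMU; the kernel then picks the right pSMMU (or fans out to all of
 * them) when an invalidation arrives */
struct viommu {
        struct s2_hwpt *s2;
        uint16_t vmid[NR_PSMMU];   /* vmid[i] is valid on pSMMU i */
};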

so it's really an implementation choice that you want the same viommu
topology scheme w/ or w/o VCMDQ.

we all know there are some constraints with copying the physical topology,
e.g. when hotplugging a device or migrating. for VCMDQ that's clearly an
acceptable tradeoff for performance. w/o VCMDQ I'm not sure whether you'd
want to keep more flexibility for the user. 😊

