[PATCH v2 00/19] iommufd: Add VIOMMU infrastructure (Part-1)

Yi Liu yi.l.liu at intel.com
Fri Sep 27 05:12:20 PDT 2024


On 2024/9/27 14:32, Nicolin Chen wrote:
> On Fri, Sep 27, 2024 at 01:54:45PM +0800, Yi Liu wrote:
>>>>> Baolu told me that Intel may have the same: different domain IDs
>>>>> on different IOMMUs; multiple IOMMU instances on one chip:
>>>>> https://lore.kernel.org/linux-iommu/cf4fe15c-8bcb-4132-a1fd-b2c8ddf2731b@linux.intel.com/
>>>>> So, I think we are having the same situation here.
>>>>
>>>> yes, it's called iommu unit or dmar. A typical Intel server can have
>>>> multiple iommu units. But like Baolu mentioned in that thread, the intel
>>>> iommu driver maintains separate domain ID spaces for iommu units, which
>>>> means a given iommu domain has different DIDs when associated with
>>>> different iommu units. So intel side is not suffering from this so far.
>>>
>>> An ARM SMMU has its own VMID pool as well. The suffering comes
>>> from associating VMIDs to one shared parent S2 domain.
>>
>> Is this because the VMID is tied to an S2 domain?
> 
> On ARM, yes. VMID is a part of S2 domain stuff.
> 
>>> Does a DID per S1 nested domain or parent S2? If it is per S2,
>>> I think the same suffering applies when we share the S2 across
>>> IOMMU instances?
>>
>> per S1 I think. The iotlb efficiency is low as S2 caches would be
>> tagged with different DIDs even if the page table is the same. :)
> 
> On ARM, the stage-1 is tagged with an ASID (Address Space ID)
> while the stage-2 is tagged with a VMID. Then an invalidation
> for a nested S1 domain must require the VMID from the S2. The
> ASID may be also required if the invalidation is specific to
> that address space (otherwise, broadcast per VMID.)
Looks like the nested s1 caches are tagged with both ASID and VMID.
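
A rough sketch of how I read that, in C. All names here (tlb_tag, inv_req,
inv_matches) are made up for illustration and are not taken from the SMMU
driver or spec. It only shows the matching rule described above: the VMID
must always match, and the ASID is compared only when the invalidation
targets one address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag of a nested stage-1 cache entry (illustration only). */
struct tlb_tag {
	uint16_t vmid;	/* comes from the parent stage-2 domain */
	uint16_t asid;	/* comes from the stage-1 address space */
};

/* Hypothetical invalidation request (illustration only). */
struct inv_req {
	uint16_t vmid;
	uint16_t asid;
	bool by_asid;	/* false: hit every ASID under this VMID */
};

/* Return true if the request would invalidate the tagged entry. */
bool inv_matches(const struct tlb_tag *tag, const struct inv_req *req)
{
	assert(tag && req);

	/* The VMID from the S2 is always required. */
	if (tag->vmid != req->vmid)
		return false;

	/* The ASID narrows the scope only when the request asks for it. */
	return !req->by_asid || tag->asid == req->asid;
}
```

So two entries with the same ASID but different VMIDs never alias, and a
per-VMID request covers all ASIDs under that VMID.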

> I feel these two might act somehow similarly to the two DIDs
> during nested translations?

Not quite the same. Is it possible for the ASID to be the same for stage-1?
On the Intel VT-d side the PASID can be the same. For gIOVA, for example,
all devices use the same ridpasid. In the scenario I described in my reply
to Baolu [1], we do choose to use different DIDs to differentiate the caches
for the two devices.

[1] 
https://lore.kernel.org/linux-iommu/4bc9bd20-5aae-440d-84fd-f530d0747c23@intel.com/

>>>>> Adding another vIOMMU wrapper on the other hand can allow us to
>>>>> allocate different VMIDs/DIDs for different IOMMUs.
>>>>
>>>> that looks like generalizing the association of the iommu domain and the
>>>> iommu units?
>>>
>>> A vIOMMU is a presentation/object of a physical IOMMU instance
>>> in a VM.
>>
>> a slice of a physical IOMMU. is it?
> 
> Yes. When multiple nested translations happen at the same time,
> IOMMU (just like a CPU) is shared by these slices. And so is an
> invalidation queue executing multiple requests.
> 
> Perhaps calling it a slice sounds more accurate, as I guess all
> the confusion comes from the name "vIOMMU" that might be thought
> to be a user space object/instance that likely holds all virtual
> stuff like stage-1 HWPT or so?

yeah. Maybe this confusion partly comes from starting the description with
the cache invalidation as well. At first glance I failed to see why an S2
hwpt needs to be part of the vIOMMU obj.

> 
>> and you treat S2 hwpt as a resource of the physical IOMMU as well.
> 
> Yes. A parent HWPT (in the old days, we called it "kernel-managed"
> HWPT) is not a user space thing. This belongs to a kernel owned
> object.
> 
>>> This presentation gives a VMM some capability to take
>>> advantage of some of the HW resources of the physical IOMMU:
>>> - a VMID is a small HW resource to tag the cache;
>>> - a vIOMMU invalidation allows to access device cache that's
>>>     not straightforwardly done via an S1 HWPT invalidation;
>>> - a virtual device presentation of a physical device in a VM,
>>>     related to the vIOMMU in the VM, which contains some VM-level
>>>     info: virtual device ID, security level (ARM CCA), and etc;
>>> - Non-PRI IRQ forwarding to the guest VM;
>>> - HW-accelerated virtualization resource: vCMDQ, AMD VIOMMU;
>>
>> might be helpful to draw a diagram to show what the vIOMMU obj contains.:)
> 
> That's what I plan to. Basically looks like:
>    device---->stage1--->[ viommu [s2_hwpt, vmid, virq, HW-acc, etc.] ]

ok. let's see your new doc.
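
In the meantime, here is how I read the diagram, as a minimal C sketch. The
struct and field names (s2_hwpt, viommu, s1_hwpt, dev, device_vmid) are
hypothetical and are not the iommufd structures from this series. The
assumption is that the VMID lives in the per-IOMMU viommu object while the
S2 page table can be shared by several of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All types below are hypothetical illustrations, not iommufd objects. */
struct s2_hwpt {
	uint64_t pgtable_root;	/* parent stage-2 page table */
};

struct viommu {
	struct s2_hwpt *s2;	/* may be shared across IOMMU instances */
	uint16_t vmid;		/* allocated per physical IOMMU instance */
	int virq;		/* placeholder for IRQ forwarding state */
};

struct s1_hwpt {
	struct viommu *viommu;	/* stage-1 nests on a viommu */
	uint16_t asid;
};

struct dev {
	struct s1_hwpt *s1;
};

/* Follow device -> stage1 -> viommu to find the VMID that tags the cache. */
uint16_t device_vmid(const struct dev *d)
{
	assert(d && d->s1 && d->s1->viommu);
	return d->s1->viommu->vmid;
}
```

With this layout two devices behind different physical IOMMUs can share one
s2_hwpt and still get different VMIDs through their own viommu.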

-- 
Regards,
Yi Liu


