[RFC PATCH 00/45] KVM: Arm SMMUv3 driver for pKVM

tina.zhang tina.zhang at intel.com
Sat Feb 4 04:30:17 PST 2023



On 2/4/23 16:19, Chen, Jason CJ wrote:
> Hi, Jean,
> 
> Thanks for the information! Let's do more investigation.
> 
> Yes, if we use the enlightened method, we can skip nested translation.
> Meanwhile, we must ensure the host does not touch this capability. We may
> also need to make a trade-off to support SVM-like features.
Hi Jason,

Nested translation is also optional in VT-d. Not all IA platforms have
VT-d with nested translation support. For legacy platforms (e.g. those
on which VT-d doesn't support scalable mode), providing an enlightened
way for pKVM to isolate DMA seems reasonable. Otherwise, pKVM may need
to shadow the I/O page table, which could introduce performance
overhead.
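
To illustrate the difference between nesting and shadowing, here is a
minimal toy sketch. It is not pKVM or VT-d driver code: the single-level
arrays and all names (toy_pgtable, nested_translate, build_shadow) are
assumptions made up for this example. With nesting, the hardware walks
both stages on each access. With shadowing, the hypervisor has to
precompute the combined IOVA -> HPA table and redo that work whenever
the host changes its own table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES  8u          /* toy address space: 8 page frames */
#define INVALID UINT32_MAX  /* marks an unmapped entry */

/* Toy single-level table: index = input frame, value = output frame. */
struct toy_pgtable {
	uint32_t out[NPAGES];
};

static void toy_init(struct toy_pgtable *t)
{
	for (size_t i = 0; i < NPAGES; i++)
		t->out[i] = INVALID;
}

static void toy_map(struct toy_pgtable *t, uint32_t in, uint32_t out)
{
	assert(in < NPAGES);
	t->out[in] = out;
}

static uint32_t toy_lookup(const struct toy_pgtable *t, uint32_t in)
{
	return in < NPAGES ? t->out[in] : INVALID;
}

/*
 * Nested translation: stage 1 (host-owned, IOVA -> host_GPA) followed
 * by stage 2 (hypervisor-owned, host_GPA -> HPA).
 */
static uint32_t nested_translate(const struct toy_pgtable *s1,
				 const struct toy_pgtable *s2,
				 uint32_t iova)
{
	uint32_t gpa = toy_lookup(s1, iova);

	return gpa == INVALID ? INVALID : toy_lookup(s2, gpa);
}

/*
 * Shadowing: without nesting support, the hypervisor composes both
 * stages into one table. The result goes stale as soon as the host
 * edits stage 1, so it must be rebuilt (or patched) on every change.
 */
static void build_shadow(struct toy_pgtable *shadow,
			 const struct toy_pgtable *s1,
			 const struct toy_pgtable *s2)
{
	for (uint32_t iova = 0; iova < NPAGES; iova++)
		shadow->out[iova] = nested_translate(s1, s2, iova);
}
```

In this toy model the rebuild is a full loop over the table. A real
implementation would presumably track host updates more finely, but the
extra synchronization work is where the overhead comes from.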


Regards,
-Tina
> 
> Thanks
> 
> Jason
> 
>> -----Original Message-----
>> From: Jean-Philippe Brucker <jean-philippe at linaro.org>
>> Sent: Friday, February 3, 2023 7:24 PM
>> To: Chen, Jason CJ <jason.cj.chen at intel.com>
>> Cc: Tian, Kevin <kevin.tian at intel.com>; maz at kernel.org;
>> catalin.marinas at arm.com; will at kernel.org; joro at 8bytes.org;
>> robin.murphy at arm.com; james.morse at arm.com;
>> suzuki.poulose at arm.com; oliver.upton at linux.dev; yuzenghui at huawei.com;
>> smostafa at google.com; dbrazdil at google.com; ryan.roberts at arm.com;
>> linux-arm-kernel at lists.infradead.org; kvmarm at lists.linux.dev;
>> iommu at lists.linux.dev; Zhang, Tina <tina.zhang at intel.com>
>> Subject: Re: [RFC PATCH 00/45] KVM: Arm SMMUv3 driver for pKVM
>>
>> Hi Jason,
>>
>> On Fri, Feb 03, 2023 at 08:39:41AM +0000, Chen, Jason CJ wrote:
>>>>>> btw some of my colleagues are porting pKVM to Intel platforms. I
>>>>>> believe they will post their work shortly, and it might
>>>>>> require some common framework in the pKVM hypervisor (iommu
>>>>>> domains, hypercalls, etc.), like what we have in the host iommu
>>>>>> subsystem. CC'ing them in case they have any early thoughts they
>>>>>> want to throw in. 😊
>>>>>
>>>>> Cool! The hypervisor part contains iommu/iommu.c which deals with
>>>>> hypercalls and domains and doesn't contain anything specific to
>>>>> Arm (it's only in arch/arm64 because that's where pkvm currently
>>>>> sits). It does rely on io-pgtable at the moment which is not used
>>>>> by VT-d but that can be abstracted as well. It's possible however
>>>>> that on Intel an entirely different set of hypercalls will be
>>>>> needed, if a simpler solution such as sharing page tables fits
>>>>> better because VT-d implementations are more homogeneous.
>>>>>
>>>>
>>>> Yes, depending on the choice made for VT-d, there could be different
>>>> degrees of sharing. I'll let Jason/Tina comment on their design
>>>> choice.
>>>
>>> Thanks, Kevin, for bringing us here. Currently our POC solution for
>>> VT-d is based on nested translation. As there are two levels of
>>> io-pgtable, we keep the first-level page table fully controlled by the
>>> host VM (IOVA -> host_GPA), and the second-level page table is managed
>>> by pKVM (host_GPA -> HPA). This solution is simple and straightforward,
>>> but pKVM still needs to provide vIOMMU emulation for the host (e.g.,
>>> shadowing root/context/PASID tables, emulating IOTLB flush, etc.).
>>
>> I dismissed emulating the SMMU early on because it feels too complex
>> compared to an abstracted hypercall interface, but again that may be due to
>> the high variation of configurations of the SMMU. For nesting, you could use
>> some of the interface that Yi Liu and Jacob Pan have been working on [1]. It
>> should be possible with a couple of attach-table and tlb-invalidate hypercalls
>> to avoid emulating the low-level registers and queues.
>>
>>> As far as I know, the SMMU also supports a nested translation mode.
>>> May I know which mode is used for pKVM?
>>
>> It doesn't use nested translation because it is optional in the SMMU, and this
>> series tries to support any possible implementation. Since pKVM on
>> arm64 is being used on mobile platforms I suspect that, to save space, some
>> SMMUs might not implement first-level or second-level page tables.
>> Besides, supporting nesting for Arm would still require hypercalls for pinning
>> DMA pages (solution 2).
>>
>> This series populates the second-level tables with the complete IOVA -> PA
>> translation (similarly to how VFIO works at the moment). If an
>> implementation only supports first-level tables, then the hypervisor would
>> own it and put the IOVA -> PA translation in there.
>>
>> Thanks,
>> Jean
>>
>> [1] https://lore.kernel.org/linux-iommu/1570045363-24856-2-git-send-email-jacob.jun.pan@linux.intel.com/
>>      (It's being reworked but I couldn't find a recent link)
>>
>>>
>>> We faced a similar choice of whether to share the second-level
>>> io-pgtable with the CPU pgtable, and in the end we also decided to
>>> introduce a new pgtable. This increases the complexity of page state
>>> management, as the io-pgtable and the cpu-pgtable need to agree on
>>> page ownership.
>>>
>>> Our solution is currently based on vIOMMU emulation in pKVM; the
>>> enlightened method should also be an alternative solution.
>>>
>>> Thanks
>>> Jason CJ Chen
> 
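
For reference, the enlightened (hypercall-based) approach discussed in
the quoted thread could look roughly like the toy sketch below. It is
only an illustration under stated assumptions: the names (hyp_state,
hyp_iommu_map, hyp_iommu_unmap, hyp_donate_to_guest) and the flat arrays
are hypothetical and are not taken from the series. The point it tries
to show is the page-ownership alignment Jason mentions: the hypervisor
owns the single IOVA -> PA table, only maps pages the host owns, and
refuses to give a page away while it is still mapped for DMA. IOTLB
invalidation is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define NPAGES  8u          /* toy physical memory: 8 page frames */
#define INVALID UINT32_MAX  /* marks an unmapped IOVA */

enum page_owner { OWNER_HOST, OWNER_HYP, OWNER_GUEST };

struct hyp_state {
	enum page_owner owner[NPAGES];  /* CPU-side ownership per frame */
	uint32_t iommu_pt[NPAGES];      /* hypervisor-owned IOVA -> PA */
	uint32_t dma_refs[NPAGES];      /* DMA mappings per frame */
};

static void hyp_init(struct hyp_state *h)
{
	for (size_t i = 0; i < NPAGES; i++) {
		h->owner[i] = OWNER_HOST;
		h->iommu_pt[i] = INVALID;
		h->dma_refs[i] = 0;
	}
}

/* Hypothetical map hypercall handler: 0 on success, -1 if rejected. */
static int hyp_iommu_map(struct hyp_state *h, uint32_t iova, uint32_t pa)
{
	if (iova >= NPAGES || pa >= NPAGES)
		return -1;
	if (h->owner[pa] != OWNER_HOST)     /* host may only DMA to its pages */
		return -1;
	if (h->iommu_pt[iova] != INVALID)   /* IOVA already in use */
		return -1;

	h->iommu_pt[iova] = pa;
	h->dma_refs[pa]++;                  /* pin the frame for DMA */
	return 0;
}

/* Hypothetical unmap hypercall handler: drops the mapping and the pin. */
static int hyp_iommu_unmap(struct hyp_state *h, uint32_t iova)
{
	if (iova >= NPAGES || h->iommu_pt[iova] == INVALID)
		return -1;

	h->dma_refs[h->iommu_pt[iova]]--;
	h->iommu_pt[iova] = INVALID;
	return 0;
}

/* Ownership changes must fail while the frame is reachable by DMA. */
static int hyp_donate_to_guest(struct hyp_state *h, uint32_t pa)
{
	if (pa >= NPAGES || h->owner[pa] != OWNER_HOST || h->dma_refs[pa] != 0)
		return -1;

	h->owner[pa] = OWNER_GUEST;
	return 0;
}
```

In this model the host never edits an I/O page table directly, so no
register or queue emulation is needed; the cost is one hypercall per
map or unmap request.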


