[RFC] ARM vGIC-ITS tables serialization when running protected VMs
David Woodhouse
dwmw2 at infradead.org
Tue Apr 15 02:44:39 PDT 2025
On Tue, 2025-04-15 at 09:35 +0100, Marc Zyngier wrote:
> On Mon, 14 Apr 2025 12:12:43 +0100,
> Ilias Stamatis <ilstam at amazon.com> wrote:
> >
> > # The problem
> >
> > KVM's ARM Virtual Interrupt Translation Service (ITS) interface supports the
> > KVM_DEV_ARM_ITS_SAVE_TABLES and KVM_DEV_ARM_ITS_RESTORE_TABLES operations.
> > These operations save and restore a set of tables (Device Tables, Interrupt
> > Translation Tables, Collection Table) to and from guest memory.
> >
> > This can be a problem when running a protected VM on top of pKVM or another
> > lowvisor since the host kernel (running at EL1) cannot access guest memory.
> >
>
> pKVM doesn't allow a guest to be saved/restored, full stop.

Yet. Either it's going to need to learn to support live update, or
it'll remain a toy solution.

> > # Page declassification and why ITTs are special
> >
> > The Collection and Device tables are page aligned and their sizes must be a
> > multiple of page size. If the lowvisor knows where these tables live, it is
> > possible to "declassify" the corresponding pages and configure the MMU such
> > that the EL1 host can write to guest memory directly.
> >
> > The ITTs (Interrupt Translation Tables) are different. They are NOT page
> > aligned, they are 256 byte aligned and their size is variable. That means that
> > the lowvisor can't declassify pages containing ITTs and configure the MMU
> > giving the host direct access as above since those pages may contain unrelated
> > data.
>
> And it is the responsibility of the guest to make these page aligned
> if it intends to let the hypervisor use them. To sum it up, the ITT
> isn't special at all.

The ITT has nothing to do with virtualization, does it? And despite
this being logically "DMA", I don't believe it's possible to advertise
it as being behind the SMMU, which would have allowed for access
control (and would indeed have meant that the guest would be expected
to grant access to full pages).

What exactly are you suggesting? That the GIC specification should be
changed to require page alignment, or to document that in a
confidential compute setup, the remainder of any page which contains
ITTs will be implicitly made non-confidential and shared with the
hypervisor?

And then the lowvisor would also have to snoop the ITS command queues
to even find out which pages to implicitly allow access to?
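
To put a number on "the remainder of any page", here is a minimal sketch,
assuming 4 KiB pages and the 256-byte ITT alignment mentioned above. The
struct and function names are hypothetical and exist only for this
illustration; nothing here is taken from KVM or pKVM. It computes how many
bytes of unrelated guest data would become visible if every page touched by
an ITT were shared whole.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB pages, 256-byte aligned ITTs. */
#define PAGE_SIZE_BYTES 4096ULL
#define ITT_ALIGN_BYTES 256ULL

/* Hypothetical summary of what sharing an ITT's pages would expose. */
struct itt_exposure {
	uint64_t first_page;      /* address of the first page the ITT touches */
	uint64_t nr_pages;        /* number of pages the ITT touches */
	uint64_t unrelated_bytes; /* bytes in those pages outside the ITT */
};

struct itt_exposure itt_page_exposure(uint64_t itt_base, uint64_t itt_size)
{
	struct itt_exposure e;
	uint64_t start, end;

	/* The ITT must be non-empty and 256-byte aligned. */
	assert(itt_size > 0);
	assert((itt_base & (ITT_ALIGN_BYTES - 1)) == 0);

	/* Round the start down and the end up to page boundaries. */
	start = itt_base & ~(PAGE_SIZE_BYTES - 1);
	end = (itt_base + itt_size + PAGE_SIZE_BYTES - 1) &
	      ~(PAGE_SIZE_BYTES - 1);

	e.first_page = start;
	e.nr_pages = (end - start) / PAGE_SIZE_BYTES;
	e.unrelated_bytes = (end - start) - itt_size;
	return e;
}
```

Under these assumptions, a 512-byte ITT at 0x80001100 touches one page and
would expose 3584 bytes that are not part of the ITT. The same ITT at
0x80001f00 straddles two pages and would expose 7680 bytes.
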
> >
> > If the lowvisor knows where the ITTs live in guest memory it could instead
> > perform the guest memory accesses on behalf of the host. I.e. the EL1 host
> > would attempt to save the ITTs to guest memory like it does today, that would
> > generate a data abort, and then the EL2 lowvisor could perform the copy after
> > validating that the faulty address belongs to an ITT in guest memory.
> >
> > One issue with the above is that the ITS save/restore happens at hypervisor
> > live update which is a time sensitive operation and the extra traps (one per
> > interrupt mapping?) can introduce significant additional overhead
> > there.
>
> I don't believe this for a second.

You don't believe that every millisecond of live update downtime,
perceived by the guest as unwanted steal time of a hypervisor that's
generally trying to be as quiescent as possible, is an issue?

> >
> > Another issue is that it's actually hard for the lowvisor to know where these
> > tables live without trusting the EL1 host which virtualizes the ITS. It is
> > especially hard knowing the locations of the ITTs (compared to
> > Collection/Device tables) because that probably means having to parse the ITS
> > command queue from EL2 which is complex and undesirable.
> >
> > # An alternative: Serializing ITTs into a userspace buffer
>
> NAK.
>
> Share the page-aligned memory with the rest of the hypervisor, and use
> the existing API.

That seems like a bad choice. All this is just using guest memory to
store KVM's state.

Yes, the guest provides a buffer which the virtual hardware *may* use
if it wants, but with no IOMMU or access control defined in the
specification.

It seems like it would be much cleaner just to let KVM pass its state
up to userspace for serialization like we do for all *other* KVM state,
which is what Ilias is proposing.
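
As a rough illustration of what "pass its state up to userspace" could look
like, here is a minimal sketch that packs translation entries into a flat,
caller-provided buffer instead of writing them into guest memory. Everything
in it is an assumption made for the example: the record layout, the field
names and both function names are hypothetical, and it is neither an
existing KVM ABI nor the format from Ilias's proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one interrupt translation; not a KVM ABI. */
struct ite_record {
	uint32_t device_id;
	uint32_t event_id;
	uint32_t lpi;
	uint32_t collection_id;
};

#define ITE_RECORD_BYTES 16u

/* Fixed little-endian encoding so the buffer is host-independent. */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns the number of bytes required for n records. The buffer is
 * written only when buf is non-NULL and buf_len is large enough, so a
 * caller can probe for the size first with buf == NULL.
 */
size_t ite_serialize(const struct ite_record *recs, size_t n,
		     uint8_t *buf, size_t buf_len)
{
	size_t need = n * ITE_RECORD_BYTES;
	size_t i;

	assert(recs != NULL || n == 0);
	if (buf == NULL || buf_len < need)
		return need;

	for (i = 0; i < n; i++) {
		uint8_t *p = buf + i * ITE_RECORD_BYTES;

		put_le32(p + 0, recs[i].device_id);
		put_le32(p + 4, recs[i].event_id);
		put_le32(p + 8, recs[i].lpi);
		put_le32(p + 12, recs[i].collection_id);
	}
	return need;
}

/* Returns the number of records decoded, at most max. */
size_t ite_deserialize(const uint8_t *buf, size_t buf_len,
		       struct ite_record *recs, size_t max)
{
	size_t n = buf_len / ITE_RECORD_BYTES;
	size_t i;

	if (n > max)
		n = max;
	for (i = 0; i < n; i++) {
		const uint8_t *p = buf + i * ITE_RECORD_BYTES;

		recs[i].device_id = get_le32(p + 0);
		recs[i].event_id = get_le32(p + 4);
		recs[i].lpi = get_le32(p + 8);
		recs[i].collection_id = get_le32(p + 12);
	}
	return n;
}
```

With a scheme along these lines the buffer lives in the VMM's own memory, so
no guest page has to be shared or declassified to carry the state across a
live update.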