[PATCH 00/30] KVM: arm64: Add support for protected guest memory with pKVM
Pavan Kondeti
pavan.kondeti at oss.qualcomm.com
Mon Apr 20 21:15:47 PDT 2026
On Mon, Apr 20, 2026 at 04:56:57PM +0530, Pavan Kondeti wrote:
> On Mon, Apr 20, 2026 at 11:00:35AM +0100, Will Deacon wrote:
> > On Mon, Apr 20, 2026 at 01:32:03PM +0530, Pavan Kondeti wrote:
> > > Hi Will,
> > >
> > > On Mon, Jan 05, 2026 at 03:49:08PM +0000, Will Deacon wrote:
> > > > Hi folks,
> > > >
> > > > Although pKVM has been shipping in Android kernels for a while now,
> > > > protected guest (pVM) support has been somewhat languishing upstream.
> > > > This has partly been because we've been waiting for guest_memfd() but
> > > > also because it hasn't been clear how to expose pVMs to userspace (which
> > > > is necessary for testing) without getting everything in place beforehand.
> > > > This has led to frustration on both sides of the fence [1] and so this
> > > > patch series attempts to get things moving again by exposing pVM
> > > > features in an incremental fashion based on top of anonymous memory,
> > > > which is what we have been using in Android. The big difference between
> > > > this series and the Android implementation is the graceful handling of
> > > > host stage-2 faults arising from accesses made using kernel mappings.
> > > > The hope is that this will unblock pKVM upstreaming efforts while the
> > > > guest_memfd() work continues to evolve.
> > > >
> > > > Specifically, this patch series implements support for protected guest
> > > > memory with pKVM, where pages are unmapped from the host as they are
> > > > faulted into the guest and can be shared back from the guest using pKVM
> > > > hypercalls. Protected guests are created using a new machine type
> > > > identifier and can be booted to a shell using the kvmtool patches
> > > > available at [2], which finally means that we are able to test the pVM
> > > > logic in pKVM. Since this is an incremental step towards full isolation
> > > > from the host (for example, the CPU register state and DMA accesses are
> > > > not yet isolated), creating a pVM requires a developer Kconfig option to
> > > > be enabled in addition to booting with 'kvm-arm.mode=protected' and
> > > > results in a kernel taint.
> > > >
> > >
> > > Good to see Protected VM support in upstream w/ pKVM.
> > >
> > > We (Qualcomm) have been trying to resume the Gunyah upstreaming [1] effort
> > > for some time, but the path to reusing guest_memfd is not straightforward, as
> > > guest_memfd is tightly coupled with KVM. Since the effort to use it for
> > > pKVM is still pending and refactoring it for use outside KVM is not
> > > happening anytime soon, we plan to send a Gunyah series that handles pages
> > > lent/donated to the guest in the same way this series does. Please let us
> > > know if you have any suggestions/comments for us.
> >
> > The major problem I see with this is that the host/hyp interface for
> > handling stage-2 faults is internal to pKVM. The exception is injected
> > back into the host using a funky ESR encoding and the hypercall used
> > to forcefully reclaim the page is not ABI. I have no appetite for
> > standardising these mechanisms (the flexibility is one of pKVM's big
> > advantages) but I also do not want to complicate the EL1 fault handling path
> > with hypervisor-specific crap that we have to maintain forever.
>
> Thanks, Will, for the feedback. Agreed that we don't want to sprinkle
> Gunyah-specific checks in the fault handling code. Do we need to handle
> anything apart from
>
> (a) User space accesses memory that is lent to the guest. Gunyah will
> inject a Synchronous External Abort and the offending process will be killed.
>
> (b) For kernel accesses, we need to unmap the memory at S1 while lending
> it to the guest. Earlier, Elliot attempted this with [1]. I am thinking
> we can leverage the direct map removal work done for guest_memfd w/o
> really using it :-) . I am hoping [2] can be made available to the Gunyah
> module as well.
>
> For problem (b) above, pKVM takes a different route upstream, i.e.
> pkvm_force_reclaim_guest_page(). I believe this avoids a kernel panic at
> the expense of memory corruption in the guest. Correct?
Sorry, not memory corruption, but returning -EFAULT to the VCPU_RUN ioctl,
which will probably make the VMM kill the VM.
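
To make the expected VMM behaviour concrete, here is a rough sketch of how a
run loop might classify the result of the vCPU run ioctl. The names
classify_run_result() and enum run_action are hypothetical and not taken from
kvmtool or any other VMM, and treating -EFAULT as fatal for the VM is an
assumption about typical VMM policy rather than something this series mandates.

```c
#include <errno.h>

/* Hypothetical policy: what a VMM might do after the vCPU run ioctl returns. */
enum run_action {
	RUN_ACTION_HANDLE_EXIT,	/* ioctl succeeded: go and process the exit reason */
	RUN_ACTION_RETRY,	/* interrupted (e.g. by a signal): re-enter the guest */
	RUN_ACTION_KILL_VM,	/* unrecoverable error such as EFAULT: tear the VM down */
};

/*
 * ret is the ioctl return value and err is the errno captured right after it.
 * Assumption: any error other than EINTR/EAGAIN is treated as fatal.
 */
static enum run_action classify_run_result(int ret, int err)
{
	if (ret == 0)
		return RUN_ACTION_HANDLE_EXIT;
	if (err == EINTR || err == EAGAIN)
		return RUN_ACTION_RETRY;
	return RUN_ACTION_KILL_VM;
}
```

With such a policy, a forced reclaim of a guest page would surface in the VMM
as classify_run_result(-1, EFAULT) and end with the VM being killed rather
than the host kernel panicking.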
Thanks,
Pavan