[PATCH v3 25/36] KVM: arm64: Reclaim faulting page from pKVM in spurious fault handler

Mon Mar 23 07:58:16 PDT 2026

On Sat, Mar 21, 2026 at 09:39:13AM +0000, Marc Zyngier wrote:
> On Fri, 20 Mar 2026 16:20:59 +0000,
> Marc Zyngier <maz at kernel.org> wrote:
> > 
> > On Thu, 05 Mar 2026 14:43:38 +0000,
> > Will Deacon <will at kernel.org> wrote:
> > > 
> > > Host kernel accesses to pages that are inaccessible at stage-2 result in
> > > the injection of a translation fault, which is fatal unless an exception
> > > table fixup is registered for the faulting PC (e.g. for user access
> > > routines). This is undesirable, since a get_user_pages() call could be
> > > used to obtain a reference to a donated page and then a subsequent
> > > access via a kernel mapping would lead to a panic().
> > > 
> > > Rework the spurious fault handler so that stage-2 faults injected back
> > > into the host result in the target page being forcefully reclaimed when
> > > no exception table fixup handler is registered.
> > 
> > Is there any reason why you prefer the 'inject fault' followed by
> > 'gimme that page' dance over a more direct 'unconditionally reclaim
> > the page on the back of the fault'?
> > 
> > I can't figure out what would go wrong in the latter approach, as you
> > always have an opportunity to inject a (fatal) fault if you can't
> > safely reclaim the page.
> 
> To be clear, the reason I'm asking this is that with RME (whether
> that's with the dreaded CCA or something else), injecting a fault into
> the host would involve EL3, and I don't trust EL3 to do that. There is
> also the small detail that the fault syndrome is not strictly
> architectural, meaning that EL3 would have to learn a pKVM-specific
> behaviour.
>
> But EL3 should be able to report a GPC fault to RL-EL2, which then
> could act the exact same way as pKVM, unmapping, clearing and
> releasing the page.

I wonder if we should try to get this changed/extended, given that it's
all software? Perhaps RL-EL2 could populate parts of the fault syndrome
that get injected back into the host?

Ideally, we'd run something at RL-EL2 that is tightly-coupled with KVM
and so we wouldn't need to teach EL3 anything as long as it lets the
two talk to each other...

> Thoughts?

I think we need to retain the exception injection behaviour, for two
reasons:

1. The kernel can over-read strings via load_unaligned_zeropad(). If
   this happens to walk into a protected page, we have to inject a fault
   so that the exception handler can fix things up. Otherwise, we'd
   silently poison protected pages due to software-speculative accesses.

2. If the fault comes directly from userspace (e.g. because of an
   out-of-bounds access or a virtio problem), the fault handler in the
   kernel will inject a signal back to userspace which can be caught and
   handled synchronously. I suppose we could require the VMM to make the
   guest memory PROT_NONE if it wanted to preserve this behaviour
   (assuming it doesn't break GUP). Still, it feels desirable to report
   the problem synchronously when we can.
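
To make the ordering concrete, here is a rough userspace model of the
policy I have in mind. It is only a sketch and not the code from the
patch: classify_fault(), find_fixup(), struct extable_entry and the
fault_action values are all made-up names, and the table stores absolute
addresses for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes for a stage-2 fault injected back into the host. */
enum fault_action {
	FAULT_FIXUP,	/* resume at the extable fixup; page left alone */
	FAULT_SIGNAL,	/* signal the faulting userspace task */
	FAULT_RECLAIM,	/* no handler: forcefully reclaim the page */
};

/* Simplified stand-in for an exception table entry (absolute addresses). */
struct extable_entry {
	uintptr_t insn;		/* address of the instruction that may fault */
	uintptr_t fixup;	/* address to resume at if it does */
};

/* Linear search for a fixup registered for the faulting PC. */
static const struct extable_entry *
find_fixup(const struct extable_entry *tbl, size_t n, uintptr_t pc)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].insn == pc)
			return &tbl[i];
	}
	return NULL;
}

/*
 * Decide what to do with an injected fault:
 *  - faults taken from userspace are reported with a signal;
 *  - kernel faults with a fixup (e.g. an over-read past the end of a
 *    string) take the fixup and never touch the protected page;
 *  - anything else falls back to reclaiming the page.
 */
static enum fault_action
classify_fault(const struct extable_entry *tbl, size_t n, uintptr_t pc,
	       bool from_user)
{
	if (from_user)
		return FAULT_SIGNAL;
	if (find_fixup(tbl, n, pc))
		return FAULT_FIXUP;
	return FAULT_RECLAIM;
}
```

The point is that reclaim is the last resort: unconditionally reclaiming
on the back of the fault would skip the first two branches entirely.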

Will