[PATCH v3 26/36] KVM: arm64: Return -EFAULT from VCPU_RUN on access to a poisoned pte
Will Deacon
will at kernel.org
Mon Mar 23 07:58:56 PDT 2026
On Fri, Mar 20, 2026 at 04:35:44PM +0000, Marc Zyngier wrote:
> On Thu, 05 Mar 2026 14:43:39 +0000,
> Will Deacon <will at kernel.org> wrote:
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 4ff31947579b..7f705f662c40 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -890,6 +890,49 @@ static int get_valid_guest_pte(struct pkvm_hyp_vm *vm, u64 ipa, kvm_pte_t *ptep,
> > return 0;
> > }
> >
> > +int __pkvm_vcpu_in_poison_fault(struct pkvm_hyp_vcpu *hyp_vcpu)
> > +{
> > + struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(hyp_vcpu);
> > + kvm_pte_t pte;
> > + s8 level;
> > + u64 ipa;
> > + int ret;
> > +
> > + switch (kvm_vcpu_trap_get_class(&hyp_vcpu->vcpu)) {
> > + case ESR_ELx_EC_DABT_LOW:
> > + case ESR_ELx_EC_IABT_LOW:
> > + if (kvm_vcpu_trap_is_translation_fault(&hyp_vcpu->vcpu))
> > + break;
> > + fallthrough;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * The host has the faulting IPA when it calls us from the guest
> > + * fault handler but we retrieve it ourselves from the FAR so as
> > + * to avoid exposing an "oracle" that could reveal data access
> > + * patterns of the guest after initial donation of its pages.
> > + */
> > + ipa = kvm_vcpu_get_fault_ipa(&hyp_vcpu->vcpu);
> > + ipa |= kvm_vcpu_get_hfar(&hyp_vcpu->vcpu) & GENMASK(11, 0);
>
> nit: we now have FAR_TO_FIPA_OFFSET() for this.

Neat, I'll use that. Thanks.
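
For reference, the two quoted lines just OR the low 12 bits of the FAR into the aligned faulting IPA. Below is a minimal standalone sketch of that arithmetic. SKETCH_FAR_TO_FIPA_OFFSET() and sketch_fault_ipa() are hypothetical names made up for illustration, and the macro body is only an assumption based on the GENMASK(11, 0) in the patch, not the kernel's actual definition of FAR_TO_FIPA_OFFSET().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GENMASK(11, 0) from the quoted patch: bits 11..0. */
#define SKETCH_OFFSET_MASK  UINT64_C(0xfff)

/* Assumed shape of the helper: keep only the low 12 bits of the FAR. */
#define SKETCH_FAR_TO_FIPA_OFFSET(far)  ((far) & SKETCH_OFFSET_MASK)

/*
 * Combine the aligned faulting IPA (low 12 bits clear) with the
 * low 12 bits taken from the faulting virtual address.
 */
static uint64_t sketch_fault_ipa(uint64_t ipa_aligned, uint64_t far)
{
	/* The aligned IPA must not carry any offset bits of its own. */
	assert((ipa_aligned & SKETCH_OFFSET_MASK) == 0);
	return ipa_aligned | SKETCH_FAR_TO_FIPA_OFFSET(far);
}
```

With an aligned IPA of 0x80001000 and a FAR ending in 0xabc, the sketch yields 0x80001abc.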
> > diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
> > index 32294bd21dde..da0a45dab203 100644
> > --- a/arch/arm64/kvm/pkvm.c
> > +++ b/arch/arm64/kvm/pkvm.c
> > @@ -417,10 +417,13 @@ int pkvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
> > return -EINVAL;
> >
> > /*
> > - * We raced with another vCPU.
> > + * We either raced with another vCPU or the guest PTE
> > + * has been poisoned by an erroneous host access.
> > */
> > - if (mapping)
> > - return -EAGAIN;
> > + if (mapping) {
> > + ret = kvm_call_hyp_nvhe(__pkvm_vcpu_in_poison_fault);
> > + return ret ? -EFAULT : -EAGAIN;
> > + }
>
> I guess this considers that racing against another vcpu is an unlikely
> situation, because calling back into EL2 and walking the PTs isn't
> exactly cheap.

Yeah, I wanted to avoid walking the stage-2 page-table at EL2 on every
fault, so it ends up being deferred to here in the case that we find an
existing mapping for the faulting IPA.

> I wonder if there is a mechanism we could use to directly return this
> information to the host at the point of the guest fault. The only
> things I can figure out would require the PTE to be valid (access or
> permission faults, for example), and that'd break the "full PTE
> dedicated to annotations"...

Oh, I see what you mean... using the fault type as a proxy feels like it
probably won't scale so well if we ever want to use those faults for
anything else.

If we want to optimise the common case, perhaps I could set a flag in
the host kvm structure (from EL2) when the page is poisoned in
__pkvm_host_force_reclaim_page_guest() and then check that here? In that
case, only VMs that have had a page forcefully reclaimed will issue the
hypercall. There's a race, but I think it's ok because we'll get -EAGAIN
and pick up the flag the next time around.

WDYT? It might be premature optimisation, but it also feels doable?

Will
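
A rough userspace sketch of the flag idea above, to make the control flow concrete. Everything here is hypothetical: struct sketch_vm, its fields and the sketch_*() helpers are invented for illustration and are not part of the series. sketch_in_poison_fault() stands in for kvm_call_hyp_nvhe(__pkvm_vcpu_in_poison_fault), and the flag is assumed to be a simple atomic boolean.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical host-side VM state; the real layout is not shown in the thread. */
struct sketch_vm {
	atomic_bool has_poisoned_page;	/* set once a page is forcefully reclaimed */
	bool fault_is_poison;		/* stand-in for the result of the EL2 walk */
	int hypercalls;			/* counts how often the slow path is taken */
};

/* Stand-in for the hypercall: nonzero means the fault hit a poisoned PTE. */
static int sketch_in_poison_fault(struct sketch_vm *vm)
{
	vm->hypercalls++;
	return vm->fault_is_poison ? 1 : 0;
}

/* Stand-in for the reclaim path poisoning a guest page and raising the flag. */
static void sketch_force_reclaim(struct sketch_vm *vm)
{
	vm->fault_is_poison = true;
	atomic_store(&vm->has_poisoned_page, true);
}

/* Called when a mapping already exists for the faulting IPA. */
static int sketch_existing_mapping(struct sketch_vm *vm)
{
	/* Common case: nothing was ever reclaimed, so this is a vCPU race. */
	if (!atomic_load(&vm->has_poisoned_page))
		return -EAGAIN;

	/* Only VMs that have had a page reclaimed pay for the hypercall. */
	return sketch_in_poison_fault(vm) ? -EFAULT : -EAGAIN;
}
```

If the flag is raised just after the check, the caller gets -EAGAIN, the guest faults again, and the retry sees the flag and takes the hypercall path, which is the benign race mentioned above.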