[PATCH 2/2] KVM: arm64: Ensure TLBI uses correct VMID after changing context
Will Deacon
will at kernel.org
Thu Aug 15 05:08:03 PDT 2024
Hey Marc,
Thanks for having a look.
On Wed, Aug 14, 2024 at 02:30:53PM +0100, Marc Zyngier wrote:
> On Wed, 14 Aug 2024 13:34:29 +0100,
> Will Deacon <will at kernel.org> wrote:
> >
> > When the target context passed to enter_vmid_context() matches the
> > current running context, the function returns early without manipulating
> > the registers of the stage-2 MMU. This can result in a stale VMID due to
> > the lack of an ISB instruction in exit_vmid_context() after writing the
>
> This refers to the exit_vmid_context() *before* a subsequent
> enter_vmid_context(), right?
Yes!
> > VTTBR when ARM64_WORKAROUND_SPECULATIVE_AT is not enabled.
>
> I'm not sure I completely get the failure mode, so please bear with me.
Of course, it's confused me a tonne of times and writing the commit
message was a real pain.
> > For example, with pKVM enabled:
> >
> > // Initially running in host context
> > enter_vmid_context(guest);
> > -> __load_stage2(guest); isb // Writes VTCR & VTTBR
> > exit_vmid_context(guest);
> > -> __load_stage2(host); // Restores VTCR & VTTBR
> >
> > enter_vmid_context(host);
> > -> Returns early as we're already in host context
>
> Is this the code you are referring to?
>
> vcpu = host_ctxt->__hyp_running_vcpu;
>
> [...]
>
> if (vcpu) {
> /*
> * We're in guest context. However, for this to work, this needs
> * to be called from within __kvm_vcpu_run(), which ensures that
> * __hyp_running_vcpu is set to the current guest vcpu.
> */
> if (mmu == vcpu->arch.hw_mmu || WARN_ON(mmu != host_s2_mmu))
> return;
>
> cxt->mmu = vcpu->arch.hw_mmu;
> } else {
> /* We're in host context. */
> if (mmu == host_s2_mmu)
> return;
>
> cxt->mmu = host_s2_mmu;
> }
Yes, this is the "early return" where we don't bother touching the MMU
because we're already in the correct context.
> So I take it that no vcpu is running at this point, and that we have
> done a vcpu_put().
>
> Is there an actual path within pKVM that causes a guest TLBI to be
> followed by a host __kvm_tlb_flush_vmid() *without* a CSE? I can't
> convinced myself that such a path exist in the current upstream code
> base.
I think you're right that this can't happen upstream. We see the issue
in Android when reclaiming pages from a guest during teardown. That
amounts to unmapping pages from the guest, poisoning them and mapping
them back into the host. Mapping them into the host can then trigger
table -> block conversion and the associated TLB invalidation wasn't
effective because it was still using the guest VMID.
We can carry this patch in Android if you prefer, but given that
{enter,exit}_vmid_context() are upstream, it would be nice to land the
fix so that we don't run into this bug again in future (it took some
debugging!).
Cheers,
Will
More information about the linux-arm-kernel
mailing list