[RFC PATCH 0/1] KVM-arm: Optimize cache flush by only flushing on vcpu0

Marc Zyngier maz at kernel.org
Fri Apr 18 06:10:19 PDT 2025


On Fri, 18 Apr 2025 11:22:43 +0100,
Jiayuan Liang <ljykernel at 163.com> wrote:
> 
> This is an RFC patch to optimize cache flushing behavior in KVM/arm64.
> 
> When toggling cache state in a multi-vCPU guest, we currently flush the VM's
> stage2 page tables on every vCPU that transitions cache state. This leads to
> redundant cache flushes during guest boot, as each vCPU performs the same
> flush operation.
> 
> In a typical guest boot sequence, vcpu0 is the first to enable caches, and
> other vCPUs follow afterward. By the time secondary vCPUs enable their caches,
> the flush performed by vcpu0 has already ensured cache coherency for the
> entire VM.

The most immediate issue I can spot is that vcpu0 is not special.
There is nothing that says vcpu0 will be the first to switch its MMU
on, nor that vcpu0 will ever be running. I guess what you would want
instead is that the *first* vcpu that enables its MMU performs the
CMOs, while the others may not have to.
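
If you wanted to go down that route, it would probably look something
like the sketch below (completely untested; "stage2_cmo_done" is a
made-up flag that would have to be added to struct kvm_arch as an
unsigned long, and cleared again whenever the guest turns its caches
back off), rather than keying anything off the vcpu index:

	static void first_vcpu_cache_flush(struct kvm_vcpu *vcpu)
	{
		struct kvm *kvm = vcpu->kvm;

		/* Whichever vCPU gets here first does the stage-2 flush */
		if (!test_and_set_bit(0, &kvm->arch.stage2_cmo_done))
			stage2_flush_vm(kvm);
	}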

But even then, this changes a behaviour some guests *may* be relying
on, which is that what they have written while their MMU was off is
visible with the MMU on, without the guest doing any CMO of its own.

A lot of this stuff comes from the days where we were mostly running
32bit guests, some of which had (and still have) pretty bad
assumptions (set/way operations being one of them).

64bit guests *should* be much better behaved, and I wonder whether we
could actually drop the whole thing altogether for those. Something
like the hack below.

But this requires testing and more thought than I'm prepared to put in
on a day off... ;-)

Thanks,

	M.

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index bd020fc28aa9c..9d05e65433916 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -85,9 +85,11 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
 	 * For non-FWB CPUs, we trap VM ops (HCR_EL2.TVM) until M+C
 	 * get set in SCTLR_EL1 such that we can detect when the guest
 	 * MMU gets turned on and do the necessary cache maintenance
-	 * then.
+	 * then. Limit this dance to 32bit guests, assuming that 64bit
+	 * guests are reasonably behaved.
 	 */
-	if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
+	if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) &&
+	    vcpu_el1_is_32bit(vcpu))
 		vcpu->arch.hcr_el2 |= HCR_TVM;
 }
 

-- 
Jazz isn't dead. It just smells funny.


