[PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb

Mon Jun 8 13:55:25 PDT 2026

On Mon, Jun 08, 2026 at 09:11:08AM +0100, Marc Zyngier wrote:
> Sashiko reports that there is a race between initialising vncr_tlb
> and making use of it, as we don't hold the mmu_lock at this point.
> 
> Additionally, it identifies a memory leak, should userspace repeatedly
> invokes the KVM_RUN ioctl after a failure of kvm_arch_vcpu_run_pid_change(),
> as we assign vncr_tlb blindly on first run, irrespective of prior
> allocations.
> 
> Slap the two bugs in one go by taking the kvm->mmu_lock on assigning
> vncr_tlb, preventing the race for good, and by checking that vncr_tlb
> is indeed NULL prior to allocation.
> 
> Reported-by: Sashiko <sashiko-bot at kernel.org>
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org
> ---
>  arch/arm64/kvm/nested.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 690b8e8564166..d11e36b3cfcc2 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1253,8 +1253,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
>  	if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
>  		return 0;
>  
> -	vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> -					  GFP_KERNEL_ACCOUNT);
> +	if (!vcpu->arch.vncr_tlb) {
> +		struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> +						  GFP_KERNEL_ACCOUNT);
> +
> +		scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> +			vcpu->arch.vncr_tlb = vt;
> +	}

(I am not familiar with this code at all, so apologies in advance if I
am making an idiot out of myself here)

IIUC, the point of holding the lock here is *not* to protect against
concurrent initialization, since in that case the NULL check would need
to be done under the lock as well.

Rather, the goal is to prevent reordering of the zeroing done by
kzalloc and the assignment to vcpu->arch.vncr_tlb, by relying on the
barriers provided by the lock. The readers already hold the lock, so
taking it here conveniently means we do not need to add any barriers to
the readers.

Is my understanding correct?

If yes, I think the code looks confusing, at least to a layman like
myself. It initially seems like the lock protects against concurrent
initializations, but then the NULL check is not done again under the
lock. The goal of the lock is not clear without the original report.

Maybe it would be clearer to use explicit barriers if the goal is to
prevent reordering?
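
To illustrate what I mean by explicit barriers, here is a rough
userspace C11 sketch of release/acquire publication. It is not kernel
code and not a proposed patch: struct tlb_sketch, published_tlb,
publish_tlb() and get_tlb() are all made-up names, and it assumes a
single initialising thread, like the unlocked NULL check in the patch.
In the kernel I would expect the equivalents to be smp_store_release()
on the writer side and smp_load_acquire() on the reader side.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct vncr_tlb. */
struct tlb_sketch {
	int valid;
	unsigned long gva;
};

/* Hypothetical stand-in for vcpu->arch.vncr_tlb. */
static _Atomic(struct tlb_sketch *) published_tlb;

/*
 * Writer: allocate zeroed memory, then publish the pointer with
 * release semantics so the zeroing is ordered before the store.
 * Assumes only one thread ever initialises the pointer.
 */
static int publish_tlb(void)
{
	struct tlb_sketch *vt;

	/* Skip the allocation if a previous call already published one. */
	if (atomic_load_explicit(&published_tlb, memory_order_relaxed))
		return 0;

	vt = calloc(1, sizeof(*vt));
	if (!vt)
		return -1;

	atomic_store_explicit(&published_tlb, vt, memory_order_release);
	return 0;
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * non-NULL result always points at fully zeroed memory.
 */
static struct tlb_sketch *get_tlb(void)
{
	return atomic_load_explicit(&published_tlb, memory_order_acquire);
}
```

The obvious downside is that every reader then needs the acquire load,
which is exactly what taking the lock on the writer side avoids.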