[PATCH v2] RISC-V: KVM: Fix use-after-free in kvm_riscv_gstage_get_leaf()

Anup Patel anup at brainfault.org
Thu Feb 26 00:26:54 PST 2026


On Mon, Feb 2, 2026 at 9:31 AM Jiakai Xu <xujiakai2025 at iscas.ac.cn> wrote:
>
> While fuzzing KVM on RISC-V, a use-after-free was observed in
> kvm_riscv_gstage_get_leaf(), where ptep_get() dereferences a
> freed gstage page table page during gfn unmap.
>
> The crash manifests as:
>   use-after-free in ptep_get include/linux/pgtable.h:340 [inline]
>   use-after-free in kvm_riscv_gstage_get_leaf arch/riscv/kvm/gstage.c:89
>   Call Trace:
>     ptep_get include/linux/pgtable.h:340 [inline]
>     kvm_riscv_gstage_get_leaf+0x2ea/0x358 arch/riscv/kvm/gstage.c:89
>     kvm_riscv_gstage_unmap_range+0xf0/0x308 arch/riscv/kvm/gstage.c:265
>     kvm_unmap_gfn_range+0x168/0x1fc arch/riscv/kvm/mmu.c:256
>     kvm_mmu_unmap_gfn_range virt/kvm/kvm_main.c:724 [inline]
>   page last free pid 808 tgid 808 stack trace:
>     kvm_riscv_mmu_free_pgd+0x1b6/0x26a arch/riscv/kvm/mmu.c:457
>     kvm_arch_flush_shadow_all+0x1a/0x24 arch/riscv/kvm/mmu.c:134
>     kvm_flush_shadow_all virt/kvm/kvm_main.c:344 [inline]
>
> The UAF is caused by gstage page table walks running concurrently with
> gstage pgd teardown. In particular, kvm_unmap_gfn_range() can traverse
> gstage page tables while kvm_arch_flush_shadow_all() frees the pgd,
> leading to use-after-free of page table pages.
>
> Fix the issue by serializing gstage unmap and pgd teardown with
> kvm->mmu_lock. Holding mmu_lock ensures that gstage page tables
> remain valid for the duration of unmap operations and prevents
> concurrent frees.
>
> This matches existing RISC-V KVM usage of mmu_lock to protect gstage
> map/unmap operations, e.g. kvm_riscv_mmu_iounmap.
>
> Fixes: dd82e35638d67f ("RISC-V: KVM: Factor-out g-stage page table management")
> Signed-off-by: Jiakai Xu <xujiakai2025 at iscas.ac.cn>
> Signed-off-by: Jiakai Xu <jiakaiPeanut at gmail.com>
> ---
> V1 -> V2: Removed kvm->mmu_lock in kvm_arch_flush_shadow_all().
>
>  arch/riscv/kvm/mmu.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index a1c3b2ec1dde5..1d71c1cb429ca 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -268,9 +268,11 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>         gstage.flags = 0;
>         gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
>         gstage.pgd = kvm->arch.pgd;
> +       spin_lock(&kvm->mmu_lock);

Unconditionally taking mmu_lock here causes the following crash
when powering off the KVM guest.

[   88.985889] rcu: INFO: rcu_sched self-detected stall on CPU
[   88.986721] rcu:     1-....: (5249 ticks this GP) idle=9184/1/0x4000000000000000 softirq=175/175 fqs=2223
[   88.987816] rcu:     (t=5250 jiffies g=-791 q=31 ncpus=4)
[   88.988993] CPU: 1 UID: 0 PID: 78 Comm: lkvm-static Not tainted 7.0.0-rc1-00002-gf242f3f353e6-dirty #3 PREEMPTLAZY
[   88.989294] Hardware name: riscv-virtio,qemu (DT)
[   88.989401] epc : queued_spin_lock_slowpath+0x54/0x474
[   88.990144]  ra : do_raw_spin_lock+0xaa/0xd0
[   88.990182] epc : ffffffff80bc7404 ra : ffffffff800893ea sp : ff200000003bb8d0
[   88.990213]  gp : ffffffff81a32490 tp : ff60000002360c80 t0 : 616d6e755f6d766b
[   88.990231]  t1 : 00000000fffff000 t2 : 70616d6e755f6d76 s0 : ff200000003bb8e0
[   88.990286]  s1 : 00007fff7f600000 a0 : 0000000000000000 a1 : ff600000047b7000
[   88.990304]  a2 : 00000000000000ff a3 : 0000000000000000 a4 : 0000000000000001
[   88.990322]  a5 : ff600000047b7000 a6 : ffffffff81876808 a7 : 80000000fffff000
[   88.990341]  s2 : ff600000047b7000 s3 : ff600000047b7a90 s4 : 0000000000000000
[   88.990359]  s5 : 00007fff8f600000 s6 : 0000000000000001 s7 : 0000000000000001
[   88.990378]  s8 : 0000000000000fff s9 : ff600000047b7488 s10: ffffffffffffffe0
[   88.990396]  s11: ff60000003b8e050 t3 : ffffffff81a49eb7 t4 : ffffffff81a49eb7
[   88.990433]  t5 : ffffffff81a49eb8 t6 : ff200000003bb728 ssp : 0000000000000000
[   88.990451] status: 0000000200000120 badaddr: 0000000000000000 cause: 8000000000000005
[   88.990581] [<ffffffff80bc7404>] queued_spin_lock_slowpath+0x54/0x474
[   88.990696] [<ffffffff80bc704e>] _raw_spin_lock+0x1a/0x24
[   88.990917] [<ffffffff01ab4a58>] kvm_unmap_gfn_range+0x98/0xc8 [kvm]
[   88.991415] [<ffffffff01aa5d22>] kvm_mmu_notifier_invalidate_range_start+0x17e/0x324 [kvm]
[   88.991608] [<ffffffff8027f6da>] __mmu_notifier_invalidate_range_start+0x62/0x1bc
[   88.991635] [<ffffffff8022d554>] unmap_vmas+0x120/0x134
[   88.991654] [<ffffffff8024cc0a>] unmap_region+0x76/0xc0
[   88.991675] [<ffffffff8024cd18>] vms_complete_munmap_vmas+0xc4/0x1c0
[   88.991695] [<ffffffff8024dd5e>] do_vmi_align_munmap+0x152/0x178
[   88.991716] [<ffffffff8024de24>] do_vmi_munmap+0xa0/0x148
[   88.991736] [<ffffffff8024f4b2>] __vm_munmap+0xaa/0x140
[   88.991757] [<ffffffff802389c8>] __riscv_sys_munmap+0x38/0x40
[   88.991778] [<ffffffff80bbb048>] do_trap_ecall_u+0x260/0x45c
[   88.991812] [<ffffffff80bc87a0>] handle_exception+0x168/0x174

The stall occurs because kvm_unmap_gfn_range() can be reached from the
MMU notifier path with kvm->mmu_lock already held, so taking the lock
unconditionally self-deadlocks. Instead, we should only take mmu_lock
if it is not already held.

Something like this ...

diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 0b75eb2a1820..87c8f41482c5 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -245,6 +245,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
        struct kvm_gstage gstage;
+       bool mmu_locked;

        if (!kvm->arch.pgd)
                return false;
@@ -253,9 +254,12 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
        gstage.flags = 0;
        gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
        gstage.pgd = kvm->arch.pgd;
+       mmu_locked = spin_trylock(&kvm->mmu_lock);
        kvm_riscv_gstage_unmap_range(&gstage, range->start << PAGE_SHIFT,
                                     (range->end - range->start) << PAGE_SHIFT,
                                     range->may_block);
+       if (mmu_locked)
+               spin_unlock(&kvm->mmu_lock);
        return false;
 }

I have taken care of this and queued it as a fix for Linux-7.0-rcX

Regards,
Anup
