[PATCH 5/8] KVM: arm64: Propagate stage-2 map failure on host->guest share
Fuad Tabba
tabba at google.com
Tue Apr 28 03:30:05 PDT 2026
__pkvm_host_share_guest() mutates the host vmemmap for every page in
the range (sets PKVM_PAGE_SHARED_OWNED and increments
host_share_guest_count) and then calls kvm_pgtable_stage2_map() to
install the guest stage-2 mapping. The stage-2 map's return value was
wrapped in WARN_ON() and otherwise discarded.

At EL2 in nVHE/pKVM, WARN_ON() is not warn-and-continue: it expands
to a BRK that enters the invalid-host-el2 vector and branches to
hyp_panic(), declared __noreturn. WARN_ON of a reachable failure at
EL2 is a panic primitive, not a debug aid.

kvm_pgtable_stage2_map() can fail in reachable ways: the stage-2
walker requests fresh pages from the caller's memcache and returns
-ENOMEM when the memcache is exhausted mid-walk. The host controls
the vcpu memcache via the topup interface, so an under-provisioned
share request converts a recoverable error into a fatal hyp panic.

Capture the stage-2 map return value and propagate it. The walker
may have installed leaf entries for some pages in the IPA range
before failing, so unmap the range to clear any partial mappings;
otherwise the guest would retain stage-2 access to pages the host is
about to reclaim.

Then roll back the host vmemmap mutations from the forward pass: the
forward pass increments the count by 1 on every page, and the only
forward state transition is OWNED -> SHARED_OWNED (the count 0 -> 1
transition). The reverse pass decrements the count and, if it drops
back to zero, restores PKVM_PAGE_OWNED. Pages already SHARED_OWNED
with other sharers (count > 1 after the forward pass) only need the
count decremented.

Fixes: d0bd3e6570ae ("KVM: arm64: Introduce __pkvm_host_share_guest()")
Signed-off-by: Fuad Tabba <tabba at google.com>
---
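For illustration only, the forward pass, the partial-map failure and
the rollback described above can be modelled in a few lines of
userspace C. This is a rough sketch, not the hyp code: every model_*
name is hypothetical, the mock map charges one memcache page per leaf
(a simplification of the real walker), and guest_mapped stands in for
a leaf entry in this guest's stage-2 only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the host page states; not the kernel's. */
enum model_state { MODEL_OWNED, MODEL_SHARED_OWNED };

struct model_page {
	enum model_state host_state;
	unsigned int share_count;	/* models host_share_guest_count */
	bool guest_mapped;		/* models a leaf in this guest's stage-2 */
};

/*
 * Mock stage-2 map. Assumption: one memcache page is consumed per
 * leaf, so the walk fails with -ENOMEM once the budget runs out and
 * leaves the leaves installed so far in place.
 */
static int model_stage2_map(struct model_page *pages, size_t nr,
			    size_t *memcache)
{
	for (size_t i = 0; i < nr; i++) {
		if (*memcache == 0)
			return -ENOMEM;	/* pages[0..i) remain mapped */
		(*memcache)--;
		pages[i].guest_mapped = true;
	}
	return 0;
}

/* Mock unmap: clears every leaf in the range, mapped or not. */
static void model_stage2_unmap(struct model_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		pages[i].guest_mapped = false;
}

static int model_share_guest(struct model_page *pages, size_t nr,
			     size_t memcache)
{
	int ret;

	/* Forward pass: mark every page shared and take a reference. */
	for (size_t i = 0; i < nr; i++) {
		pages[i].host_state = MODEL_SHARED_OWNED;
		pages[i].share_count++;
	}

	ret = model_stage2_map(pages, nr, &memcache);
	if (ret) {
		/* Drop partial leaves before touching host bookkeeping. */
		model_stage2_unmap(pages, nr);

		/* Reverse pass: only a count that hits zero goes back to OWNED. */
		for (size_t i = 0; i < nr; i++) {
			pages[i].share_count--;
			if (!pages[i].share_count)
				pages[i].host_state = MODEL_OWNED;
		}
	}
	return ret;
}
```

Calling model_share_guest() on three pages with a budget of two makes
the mock map fail on the third page. In this model the call returns
-ENOMEM, no page is left mapped, and a page that already had another
sharer keeps its SHARED_OWNED state with its original count.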
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 30 ++++++++++++++++++++++++---
1 file changed, 27 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 28a471d1927c..7044913a0758 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -1458,9 +1458,33 @@ int __pkvm_host_share_guest(u64 pfn, u64 gfn, u64 nr_pages, struct pkvm_hyp_vcpu
 		page->host_share_guest_count++;
 	}
 
-	WARN_ON(kvm_pgtable_stage2_map(&vm->pgt, ipa, size, phys,
-				       pkvm_mkstate(prot, PKVM_PAGE_SHARED_BORROWED),
-				       &vcpu->vcpu.arch.pkvm_memcache, 0));
+	ret = kvm_pgtable_stage2_map(&vm->pgt, ipa, size, phys,
+				     pkvm_mkstate(prot, PKVM_PAGE_SHARED_BORROWED),
+				     &vcpu->vcpu.arch.pkvm_memcache, 0);
+	if (ret) {
+		/*
+		 * Stage-2 map can fail mid-walk (e.g. -ENOMEM from the
+		 * memcache), leaving partial leaf entries installed in the
+		 * guest stage-2. Tear them down before rolling back host
+		 * bookkeeping; otherwise the guest would retain access to
+		 * pages the host is about to reclaim as PKVM_PAGE_OWNED.
+		 */
+		kvm_pgtable_stage2_unmap(&vm->pgt, ipa, size);
+
+		/*
+		 * Roll back the host vmemmap mutations applied above. A page
+		 * whose host_share_guest_count is now 1 was PKVM_PAGE_OWNED
+		 * before this call (count 0->1, state OWNED->SHARED_OWNED);
+		 * undo both. A page with count > 1 was already
+		 * PKVM_PAGE_SHARED_OWNED with other sharers; only the count
+		 * needs to be decremented.
+		 */
+		for_each_hyp_page(page, phys, size) {
+			page->host_share_guest_count--;
+			if (!page->host_share_guest_count)
+				set_host_state(page, PKVM_PAGE_OWNED);
+		}
+	}
 
 unlock:
 	guest_unlock_component(vm);
--
2.54.0.545.g6539524ca2-goog