[PATCH 8/8] KVM: arm64: Propagate stage-2 map failure on guest->host unshare

Fuad Tabba tabba at google.com
Tue Apr 28 03:30:08 PDT 2026


__pkvm_guest_unshare_host() re-acquires exclusive guest ownership of
a page by (i) annotating the host stage-2 PTE via
host_stage2_set_owner_metadata_locked(), (ii) mapping the page in
the guest stage-2 as PKVM_PAGE_OWNED via kvm_pgtable_stage2_map(),
and (iii) restoring host ownership via
host_stage2_set_owner_locked(). The map's return value was wrapped
in WARN_ON() and otherwise discarded.

At EL2 in nVHE/pKVM, WARN_ON() is not warn-and-continue: it expands
to a BRK that enters the invalid-host-el2 vector and branches to
hyp_panic(), declared __noreturn.

__pkvm_guest_unshare_host() calls get_valid_guest_pte() before the
map, which verifies that a valid last-level (PAGE_SIZE) leaf PTE
already exists for the IPA. Because the leaf and all intermediate
tables are in place, the subsequent kvm_pgtable_stage2_map()
replacing it cannot fail via -ENOMEM: no block to split, no new
tables to install. The failure path is not currently reachable.

Nevertheless, WARN_ON() on any fallible call is the wrong pattern at
EL2. Capture the return value and propagate it. The unmap() and
host-side rollback are kept as defensive guards for the currently
unreachable failure path. The rollback's
WARN_ON(__host_set_page_state_range()) asserts an impossible state:
the host leaf PTE was just written by
host_stage2_set_owner_metadata_locked(), so the reverse idmap
rewrite cannot require new page-table allocation from host_s2_pool.
This is the correct use of WARN_ON() at EL2: an impossible-state
assertion, not a reachable error being ignored.

Fixes: 246c976c370d ("KVM: arm64: Implement the MEM_UNSHARE hypercall for protected VMs")
Signed-off-by: Fuad Tabba <tabba at google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 37 ++++++++++++++++++---------
 1 file changed, 25 insertions(+), 12 deletions(-)
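
The capture, roll back and propagate pattern described in the commit
message can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the hypervisor code: every
name below (model_page, model_map(), model_restore_host(),
model_unshare()) is hypothetical, the map result is injected by the
caller, and assert() stands in for an impossible-state WARN_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical host-side page states, loosely named after the pKVM ones. */
enum model_state { MODEL_SHARED_BORROWED, MODEL_GUEST_OWNED };

struct model_page {
	enum model_state host_state; /* host stage-2 view of the page */
	bool guest_mapped;           /* guest stage-2 leaf present? */
	int map_result;              /* injected result of the map step */
};

/* Stand-in for the fallible guest stage-2 map. */
static int model_map(struct model_page *p)
{
	if (p->map_result)
		return p->map_result;
	p->guest_mapped = true;
	return 0;
}

/* Stand-in for the host-side rollback; it cannot fail in this model. */
static int model_restore_host(struct model_page *p)
{
	p->host_state = MODEL_SHARED_BORROWED;
	return 0;
}

static int model_unshare(struct model_page *p)
{
	int ret, err;

	p->host_state = MODEL_GUEST_OWNED; /* host-side annotation */
	ret = model_map(p);                /* fallible: capture, do not assert */
	if (ret) {
		p->guest_mapped = false;   /* tear down the guest mapping */
		err = model_restore_host(p);
		assert(err == 0);          /* impossible-state assertion only */
		(void)err;
	}
	return ret;                        /* propagate to the caller */
}
```

With map_result set to -ENOMEM, model_unshare() returns -ENOMEM and
leaves the page in MODEL_SHARED_BORROWED with no guest mapping; with
map_result set to 0 it returns 0 and the page ends up guest-owned and
mapped.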

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 6fb546af699f..12f3ea7a2d75 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -984,14 +984,10 @@ int __pkvm_guest_share_host(struct pkvm_hyp_vcpu *vcpu, u64 gfn)
 				     &vcpu->vcpu.arch.pkvm_memcache, 0);
 	if (ret) {
 		/*
-		 * Stage-2 map can fail mid-walk (e.g. -ENOMEM from the
-		 * memcache), leaving partial leaf entries in the guest
-		 * stage-2 transitioned to PKVM_PAGE_SHARED_OWNED. Tear
-		 * them down so the host does not see a partially-shared
-		 * mapping it has not yet acknowledged via the host
-		 * stage-2 update below. No host bookkeeping needs
-		 * unwinding here: the only mutation prior to the failed
-		 * map is the (now-discarded) guest stage-2 update itself.
+		 * Defensive: get_valid_guest_pte() guarantees a last-level
+		 * leaf PTE already exists, so stage-2 map() cannot currently
+		 * fail here. The unmap() restores the IPA to a clean state as
+		 * a guard should the precondition ever change.
 		 */
 		kvm_pgtable_stage2_unmap(&vm->pgt, ipa, PAGE_SIZE);
 		goto unlock;
@@ -1024,13 +1020,30 @@ int __pkvm_guest_unshare_host(struct pkvm_hyp_vcpu *vcpu, u64 gfn)
 	if (__host_check_page_state_range(phys, PAGE_SIZE, PKVM_PAGE_SHARED_BORROWED))
 		goto unlock;
 
-	ret = 0;
 	meta = host_stage2_encode_gfn_meta(vm, gfn);
 	WARN_ON(host_stage2_set_owner_metadata_locked(phys, PAGE_SIZE,
 						      PKVM_ID_GUEST, meta));
-	WARN_ON(kvm_pgtable_stage2_map(&vm->pgt, ipa, PAGE_SIZE, phys,
-				       pkvm_mkstate(KVM_PGTABLE_PROT_RWX, PKVM_PAGE_OWNED),
-				       &vcpu->vcpu.arch.pkvm_memcache, 0));
+	ret = kvm_pgtable_stage2_map(&vm->pgt, ipa, PAGE_SIZE, phys,
+				     pkvm_mkstate(KVM_PGTABLE_PROT_RWX, PKVM_PAGE_OWNED),
+				     &vcpu->vcpu.arch.pkvm_memcache, 0);
+	if (ret) {
+		/*
+		 * Defensive: get_valid_guest_pte() guarantees a last-level
+		 * leaf PTE already exists, so stage-2 map() cannot currently
+		 * fail here. The unmap() and host-side rollback below are
+		 * kept as guards should the precondition ever change.
+		 */
+		kvm_pgtable_stage2_unmap(&vm->pgt, ipa, PAGE_SIZE);
+
+		/*
+		 * Roll back the host stage-2 mutation above: the host leaf
+		 * PTE was just written by host_stage2_set_owner_metadata_locked(),
+		 * so __host_set_page_state_range() rewrites it in-place
+		 * without needing fresh page-table pages from host_s2_pool.
+		 */
+		WARN_ON(__host_set_page_state_range(phys, PAGE_SIZE,
+						    PKVM_PAGE_SHARED_BORROWED));
+	}
 unlock:
 	guest_unlock_component(vm);
 	host_unlock_component();
-- 
2.54.0.545.g6539524ca2-goog



