[PATCH v14 29/44] arm64: RMI: Runtime faulting of memory
Gavin Shan
gshan at redhat.com
Fri Jun 5 04:20:14 PDT 2026
Hi Steve,
On 5/13/26 11:17 PM, Steven Price wrote:
> At runtime if the realm guest accesses memory which hasn't yet been
> mapped then KVM needs to either populate the region or fault the guest.
>
> For memory in the lower (protected) region of IPA a fresh page is
> provided to the RMM which will zero the contents. For memory in the
> upper (shared) region of IPA, the memory from the memslot is mapped
> into the realm VM non secure.
>
> Signed-off-by: Steven Price <steven.price at arm.com>
> ---
> Changes since v13:
> * Numerous changes due to rebasing.
> * Fix addr_range_desc() to encode the correct block size.
> Changes since v12:
> * Switch to RMM v2.0 range based APIs.
> Changes since v11:
> * Adapt to upstream changes.
> Changes since v10:
> * RME->RMI renaming.
> * Adapt to upstream gmem changes.
> Changes since v9:
> * Fix call to kvm_stage2_unmap_range() in kvm_free_stage2_pgd() to set
> may_block to avoid stall warnings.
> * Minor coding style fixes.
> Changes since v8:
> * Propagate the may_block flag.
> * Minor comments and coding style changes.
> Changes since v7:
> * Remove redundant WARN_ONs for realm_create_rtt_levels() - it will
> internally WARN when necessary.
> Changes since v6:
> * Handle PAGE_SIZE being larger than RMM granule size.
> * Some minor renaming following review comments.
> Changes since v5:
> * Reduce use of struct page in preparation for supporting the RMM
> having a different page size to the host.
> * Handle a race when delegating a page where another CPU has faulted on
> a the same page (and already delegated the physical page) but not yet
> mapped it. In this case simply return to the guest to either use the
> mapping from the other CPU (or refault if the race is lost).
> * The changes to populate_par_region() are moved into the previous
> patch where they belong.
> Changes since v4:
> * Code cleanup following review feedback.
> * Drop the PTE_SHARED bit when creating unprotected page table entries.
> This is now set by the RMM and the host has no control of it and the
> spec requires the bit to be set to zero.
> Changes since v2:
> * Avoid leaking memory if failing to map it in the realm.
> * Correctly mask RTT based on LPA2 flag (see rtt_get_phys()).
> * Adapt to changes in previous patches.
> ---
> arch/arm64/include/asm/kvm_emulate.h | 8 ++
> arch/arm64/include/asm/kvm_rmi.h | 12 ++
> arch/arm64/kvm/mmu.c | 128 ++++++++++++++++----
> arch/arm64/kvm/rmi.c | 173 +++++++++++++++++++++++++++
> 4 files changed, 301 insertions(+), 20 deletions(-)
>
[...]
> @@ -1604,27 +1641,52 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
>  	bool write_fault, exec_fault;
>  	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
>  	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> -	struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt;
> +	struct kvm_vcpu *vcpu = s2fd->vcpu;
> +	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
> +	gpa_t gpa = kvm_gpa_from_fault(vcpu->kvm, s2fd->fault_ipa);
>  	unsigned long mmu_seq;
>  	struct page *page;
> -	struct kvm *kvm = s2fd->vcpu->kvm;
> +	struct kvm *kvm = vcpu->kvm;
>  	void *memcache;
>  	kvm_pfn_t pfn;
>  	gfn_t gfn;
>  	int ret;
>
> -	memcache = get_mmu_memcache(s2fd->vcpu);
> -	ret = topup_mmu_memcache(s2fd->vcpu, memcache);
> +	if (kvm_is_realm(vcpu->kvm)) {
> +		/* check for memory attribute mismatch */
> +		bool is_priv_gfn = kvm_mem_is_private(kvm, gpa >> PAGE_SHIFT);
> +		/*
> +		 * For Realms, the shared address is an alias of the private
> +		 * PA with the top bit set. Thus if the fault address matches
> +		 * the GPA then it is the private alias.
> +		 */
> +		bool is_priv_fault = (gpa == s2fd->fault_ipa);
> +
> +		if (is_priv_gfn != is_priv_fault) {
> +			kvm_prepare_memory_fault_exit(vcpu, gpa, PAGE_SIZE,
> +						      kvm_is_write_fault(vcpu),
> +						      false,
> +						      is_priv_fault);
> +			/*
> +			 * KVM_EXIT_MEMORY_FAULT requires an return code of
> +			 * -EFAULT, see the API documentation
> +			 */
> +			return -EFAULT;
> +		}
> +	}
> +

For a Realm, gmem_abort() is called by kvm_handle_guest_abort() only when
we're faulting in the private (protected) space:

	if (kvm_slot_has_gmem(memslot) && !shared_ipa_fault(vcpu->kvm, fault_ipa))
		ret = gmem_abort(&s2fd);
	else
		ret = user_mem_abort(&s2fd);

With that condition, is_priv_fault is always true in gmem_abort(), so this
block of code can be simplified to handle only the shared -> private
conversion instead of both directions.

	/* Convert the shared address to the private address for Realm */
	if (kvm_is_realm(vcpu->kvm) &&
	    !kvm_mem_is_private(kvm, gpa >> PAGE_SHIFT)) {
		/*
		 * KVM_EXIT_MEMORY_FAULT requires a return code of
		 * -EFAULT, see the API documentation
		 */
		kvm_prepare_memory_fault_exit(vcpu, gpa, PAGE_SIZE,
					      kvm_is_write_fault(vcpu),
					      false, true);
		return -EFAULT;
	}

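As a sanity check on the simplification, here is a minimal standalone C
sketch (not kernel code). The helper names exit_needed_patch(),
exit_needed_simplified() and check_equivalence() are hypothetical and only
model the two boolean conditions. It assumes is_priv_fault is always true
when gmem_abort() runs for a Realm, which is what the dispatch condition in
kvm_handle_guest_abort() appears to guarantee:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the mismatch check in the posted patch. */
static bool exit_needed_patch(bool is_priv_gfn, bool is_priv_fault)
{
	return is_priv_gfn != is_priv_fault;
}

/* Hypothetical model of the simplified check suggested above. */
static bool exit_needed_simplified(bool is_priv_gfn)
{
	return !is_priv_gfn;
}

/*
 * Assumption: gmem_abort() only sees faults on the private alias,
 * so is_priv_fault is fixed to true for both possible attribute states.
 */
static void check_equivalence(void)
{
	for (int i = 0; i < 2; i++) {
		bool is_priv_gfn = (i != 0);

		assert(exit_needed_patch(is_priv_gfn, true) ==
		       exit_needed_simplified(is_priv_gfn));
	}
}
```

Calling check_equivalence() from a test main() should pass without tripping
the assertion. The equivalence would not hold if gmem_abort() could also be
reached for faults on the shared alias.
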
[...]
> @@ -2396,7 +2475,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
> !write_fault &&
> !kvm_vcpu_trap_is_exec_fault(vcpu));
>
> -	if (kvm_slot_has_gmem(memslot))
> +	if (kvm_slot_has_gmem(memslot) && !shared_ipa_fault(vcpu->kvm, fault_ipa))
>  		ret = gmem_abort(&s2fd);
>  	else
>  		ret = user_mem_abort(&s2fd);
gmem_abort() is only called for faults in the protected (private) space.
Thanks,
Gavin