[PATCH v14 29/44] arm64: RMI: Runtime faulting of memory
Suzuki K Poulose
suzuki.poulose at arm.com
Mon Jun 8 05:58:23 PDT 2026
On 08/06/2026 11:56, Steven Price wrote:
> On 08/06/2026 10:30, Suzuki K Poulose wrote:
>> On 05/06/2026 07:23, Gavin Shan wrote:
>>> Hi Steve,
>>>
>>> On 5/13/26 11:17 PM, Steven Price wrote:
>>>> At runtime if the realm guest accesses memory which hasn't yet been
>>>> mapped then KVM needs to either populate the region or fault the guest.
>>>>
>>>> For memory in the lower (protected) region of IPA a fresh page is
>>>> provided to the RMM which will zero the contents. For memory in the
>>>> upper (shared) region of IPA, the memory from the memslot is mapped
>>>> into the realm VM non secure.
>>>>
>>>> Signed-off-by: Steven Price <steven.price at arm.com>
>>>> ---
>>>> Changes since v13:
>>>> * Numerous changes due to rebasing.
>>>> * Fix addr_range_desc() to encode the correct block size.
>>>> Changes since v12:
>>>> * Switch to RMM v2.0 range based APIs.
>>>> Changes since v11:
>>>> * Adapt to upstream changes.
>>>> Changes since v10:
>>>> * RME->RMI renaming.
>>>> * Adapt to upstream gmem changes.
>>>> Changes since v9:
>>>> * Fix call to kvm_stage2_unmap_range() in kvm_free_stage2_pgd() to set
>>>> may_block to avoid stall warnings.
>>>> * Minor coding style fixes.
>>>> Changes since v8:
>>>> * Propagate the may_block flag.
>>>> * Minor comments and coding style changes.
>>>> Changes since v7:
>>>> * Remove redundant WARN_ONs for realm_create_rtt_levels() - it will
>>>> internally WARN when necessary.
>>>> Changes since v6:
>>>> * Handle PAGE_SIZE being larger than RMM granule size.
>>>> * Some minor renaming following review comments.
>>>> Changes since v5:
>>>> * Reduce use of struct page in preparation for supporting the RMM
>>>> having a different page size to the host.
>>>> * Handle a race when delegating a page where another CPU has faulted on
>>>> the same page (and already delegated the physical page) but not yet
>>>> mapped it. In this case simply return to the guest to either use the
>>>> mapping from the other CPU (or refault if the race is lost).
>>>> * The changes to populate_par_region() are moved into the previous
>>>> patch where they belong.
>>>> Changes since v4:
>>>> * Code cleanup following review feedback.
>>>> * Drop the PTE_SHARED bit when creating unprotected page table entries.
>>>> This is now set by the RMM and the host has no control of it and the
>>>> spec requires the bit to be set to zero.
>>>> Changes since v2:
>>>> * Avoid leaking memory if failing to map it in the realm.
>>>> * Correctly mask RTT based on LPA2 flag (see rtt_get_phys()).
>>>> * Adapt to changes in previous patches.
>>>> ---
>>>> arch/arm64/include/asm/kvm_emulate.h | 8 ++
>>>> arch/arm64/include/asm/kvm_rmi.h | 12 ++
>>>> arch/arm64/kvm/mmu.c | 128 ++++++++++++++++----
>>>> arch/arm64/kvm/rmi.c | 173 +++++++++++++++++++++++++++
>>>> 4 files changed, 301 insertions(+), 20 deletions(-)
>>>>
>
> [...]
>
>>>> diff --git a/arch/arm64/kvm/rmi.c b/arch/arm64/kvm/rmi.c
>>>> index cae29fd3353c..761b38a4071c 100644
>>>> --- a/arch/arm64/kvm/rmi.c
>>>> +++ b/arch/arm64/kvm/rmi.c
>>>> @@ -597,6 +597,179 @@ static int realm_data_map_init(struct kvm *kvm, unsigned long ipa,
>>>> return ret;
>>>> }
>>>> +static unsigned long addr_range_desc(unsigned long phys, unsigned long size)
>>>> +{
>>>> + unsigned long out = 0;
>>>> +
>>>> + switch (size) {
>>>> + case P4D_SIZE:
>>>> + out = 3 | (1 << 2);
>>>> + break;
>>>> + case PUD_SIZE:
>>>> + out = 2 | (1 << 2);
>>>> + break;
>>>> + case PMD_SIZE:
>>>> + out = 1 | (1 << 2);
>>>> + break;
>>>> + case PAGE_SIZE:
>>>> + out = 0 | (1 << 2);
>>>> + break;
>>>> + default:
>>>> + /*
>>>> + * Only support mapping at the page level granularity when
>>>> + * it's an unusual length. This should get us back onto a larger
>>>> + * block size for the subsequent mappings.
>>>> + */
>>>> + out = 0 | ((MIN(size >> PAGE_SHIFT, PTRS_PER_PTE - 1)) << 2);
>>>> + break;
>>>> + }
>>>> +
>>>> + WARN_ON(phys & ~PAGE_MASK);
>>>> +
>>>> + out |= phys & PAGE_MASK;
>>>> +
>>>> + return out;
>>>> +}
>>>> +
>>>> +int realm_map_protected(struct kvm *kvm,
>>>> + unsigned long ipa,
>>>> + kvm_pfn_t pfn,
>>>> + unsigned long map_size,
>>>> + struct kvm_mmu_memory_cache *memcache)
>>>> +{
>>>> + struct realm *realm = &kvm->arch.realm;
>>>> + phys_addr_t phys = __pfn_to_phys(pfn);
>>>> + phys_addr_t base_phys = phys;
>>>> + phys_addr_t rd = virt_to_phys(realm->rd);
>>>> + unsigned long base_ipa = ipa;
>>>> + unsigned long ipa_top = ipa + map_size;
>>>> + int ret = 0;
>>>> +
>>>> + if (WARN_ON(!IS_ALIGNED(map_size, PAGE_SIZE) ||
>>>> + !IS_ALIGNED(ipa, map_size)))
>>>> + return -EINVAL;
>>>> +
>>>> + if (rmi_delegate_range(phys, map_size)) {
>>>> + /*
>>>> + * It's likely we raced with another VCPU on the same
>>>> + * fault. Assume the other VCPU has handled the fault
>>>> + * and return to the guest.
>>>> + */
>>>> + return 0;
>>>> + }
>>>> +
>>>> + while (ipa < ipa_top) {
>>>> + unsigned long flags = RMI_ADDR_TYPE_SINGLE;
>>>> + unsigned long range_desc = addr_range_desc(phys, ipa_top - ipa);
>>>> + unsigned long out_top;
>>>> +
>>>> + ret = rmi_rtt_data_map(rd, ipa, ipa_top, flags, range_desc,
>>>> + &out_top);
>>>> +
>>>> + if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>>>> + /* Create missing RTTs and retry */
>>>> + int level = RMI_RETURN_INDEX(ret);
>>>> +
>>>> + WARN_ON(level == KVM_PGTABLE_LAST_LEVEL);
>>>> + ret = realm_create_rtt_levels(realm, ipa, level,
>>>> + KVM_PGTABLE_LAST_LEVEL,
>>>> + memcache);
>>
>> Could we give the RMM a chance to make use of the block mappings by
>> creating the missing RTTs to the level that may work for the current
>> range_desc? i.e., if the range_desc is a 2M block size, we could create
>> tables up to L2 in the first go and if the RMM still needs RTT, we could
>> go further down to the KVM_PGTABLE_LAST_LEVEL. I understand this is
>> kind of an optimisation, so maybe we could defer it. (Same applies for
>> the non_secure map below).
>
> A simple change would be just to create one level at a time like this:
>
> diff --git a/arch/arm64/kvm/rmi.c b/arch/arm64/kvm/rmi.c
> index b79b96f7dffb..3f3ade1d3895 100644
> --- a/arch/arm64/kvm/rmi.c
> +++ b/arch/arm64/kvm/rmi.c
> @@ -767,15 +767,15 @@ static int realm_map_protected(struct kvm *kvm,
> /* Create missing RTTs and retry */
> int level = RMI_RETURN_INDEX(ret);
>
> - WARN_ON(level == KVM_PGTABLE_LAST_LEVEL);
> + if (WARN_ON(level >= KVM_PGTABLE_LAST_LEVEL))
> + goto err_undelegate;
> ret = realm_create_rtt_levels(realm, ipa, level,
> - KVM_PGTABLE_LAST_LEVEL,
> + level + 1,
> memcache);
> if (ret)
> goto err_undelegate;
>
> - ret = rmi_rtt_data_map(rd, ipa, ipa_top, flags,
> - range_desc, &out_top);
> + continue;
> }
That looks good to me.
Cheers
Suzuki
>
> if (WARN_ON(ret))
>
> Thanks,
> Steve
>
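
For readers following the thread, the "one level at a time" retry can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel code: struct mock_rtt, mock_data_map(), mock_create_rtt_level() and map_one_level_at_a_time() are hypothetical stand-ins for the RMM state, rmi_rtt_data_map() and realm_create_rtt_levels(), and LAST_LEVEL stands in for KVM_PGTABLE_LAST_LEVEL. The mock assumes a mapping succeeds as soon as a table exists at the level where the requested block size fits.

```c
#include <assert.h>

#define LAST_LEVEL 3	/* stand-in for KVM_PGTABLE_LAST_LEVEL */

/* Hypothetical model of the RTT state for a single IPA. */
struct mock_rtt {
	int deepest_level;	/* tables exist for levels 0..deepest_level */
	int tables_created;	/* tables the host had to create */
};

/*
 * Mock map request. block_level is the level at which the range fits as a
 * block (e.g. 2 for a 2M block with 4K pages, 3 for a single page).
 * Returns -1 on success, otherwise the level at which the walk stopped,
 * loosely mirroring RMI_RETURN_INDEX() for an RMI_ERROR_RTT result.
 */
static int mock_data_map(const struct mock_rtt *rtt, int block_level)
{
	if (rtt->deepest_level >= block_level)
		return -1;
	return rtt->deepest_level;
}

/* Mock of creating exactly one table, at the given level. */
static int mock_create_rtt_level(struct mock_rtt *rtt, int level)
{
	if (level != rtt->deepest_level + 1 || level > LAST_LEVEL)
		return -1;
	rtt->deepest_level = level;
	rtt->tables_created++;
	return 0;
}

/* Create one missing level per failure, then retry the mapping. */
static int map_one_level_at_a_time(struct mock_rtt *rtt, int block_level)
{
	assert(block_level >= 0 && block_level <= LAST_LEVEL);

	for (;;) {
		int level = mock_data_map(rtt, block_level);

		if (level < 0)
			return 0;	/* mapped */
		if (level >= LAST_LEVEL)
			return -1;	/* nothing deeper to create */
		if (mock_create_rtt_level(rtt, level + 1))
			return -1;
		/* loop round and retry, like the 'continue' in the diff */
	}
}
```

In this model, a 2M-sized request that starts with tables down to level 1 creates a single level 2 table and then maps, whereas always creating down to LAST_LEVEL would also create a level 3 table and rule out the block mapping.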