[PATCH v8 15/43] arm64: RME: Allow VMM to set RIPAS
Gavin Shan
gshan at redhat.com
Thu May 1 16:59:24 PDT 2025
On 5/2/25 2:00 AM, Steven Price wrote:
> On 30/04/2025 12:38, Gavin Shan wrote:
>> On 4/16/25 11:41 PM, Steven Price wrote:
>>> Each page within the protected region of the realm guest can be marked
>>> as either RAM or EMPTY. Allow the VMM to control this before the guest
>>> has started and provide the equivalent functions to change this (with
>>> the guest's approval) at runtime.
>>>
>>> When transitioning from RIPAS RAM (1) to RIPAS EMPTY (0) the memory is
>>> unmapped from the guest and undelegated allowing the memory to be reused
>>> by the host. When transitioning to RIPAS RAM the actual population of
>>> the leaf RTTs is done later on stage 2 fault, however it may be
>>> necessary to allocate additional RTTs to allow the RMM track the RIPAS
>>> for the requested range.
>>>
>>> When freeing a block mapping it is necessary to temporarily unfold the
>>> RTT which requires delegating an extra page to the RMM, this page can
>>> then be recovered once the contents of the block mapping have been
>>> freed.
>>>
>>> Signed-off-by: Steven Price <steven.price at arm.com>
>>> ---
>>> Changes from v7:
>>> * Replace use of "only_shared" with the upstream "attr_filter" field
>>> of struct kvm_gfn_range.
>>> * Clean up the logic in alloc_delegated_granule() for when to call
>>> kvm_account_pgtable_pages().
>>> * Rename realm_destroy_protected_granule() to
>>> realm_destroy_private_granule() to match the naming elsewhere. Also
>>> fix the return codes in the function to be descriptive.
>>> * Several other minor changes to names/return codes.
>>> Changes from v6:
>>> * Split the code dealing with the guest triggering a RIPAS change into
>>> a separate patch, so this patch is purely for the VMM setting up the
>>> RIPAS before the guest first runs.
>>> * Drop the useless flags argument from alloc_delegated_granule().
>>> * Account RTTs allocated for a guest using kvm_account_pgtable_pages().
>>> * Deal with the RMM granule size potentially being smaller than the
>>> host's PAGE_SIZE. Although note alloc_delegated_granule() currently
>>> still allocates an entire host page for every RMM granule (so wasting
>>> memory when PAGE_SIZE>4k).
>>> Changes from v5:
>>> * Adapt to rebasing.
>>> * Introduce find_map_level()
>>> * Rename some functions to be clearer.
>>> * Drop the "spare page" functionality.
>>> Changes from v2:
>>> * {alloc,free}_delegated_page() moved from previous patch to this one.
>>> * alloc_delegated_page() now takes a gfp_t flags parameter.
>>> * Fix the reference counting of guestmem pages to avoid leaking memory.
>>> * Several misc code improvements and extra comments.
>>> ---
>>> arch/arm64/include/asm/kvm_rme.h | 5 +
>>> arch/arm64/kvm/mmu.c | 8 +-
>>> arch/arm64/kvm/rme.c | 384 +++++++++++++++++++++++++++++++
>>> 3 files changed, 394 insertions(+), 3 deletions(-)
>>>
.../...
>>> +static int kvm_init_ipa_range_realm(struct kvm *kvm,
>>> + struct arm_rme_init_ripas *args)
>>> +{
>>> + gpa_t addr, end;
>>> + struct realm *realm = &kvm->arch.realm;
>>> +
>>> + addr = args->base;
>>> + end = addr + args->size;
>>> +
>>> + if (end < addr)
>>> + return -EINVAL;
>>
>> The check needs to cover 'end <= addr'. RMI_ERROR_INPUT is returned from
>> RMM::smc_rtt_init_ripas()
>> if 'end' is equal to 'addr', but we're returning 0, inconsistent to that.
>
> I agree we're different to smc_rtt_init_ripas(), but I don't really see
> why we should prevent args->size==0. Calling the low level SMC in that
> case would clearly be wrong (the kernel should be validating and that
> would show a lack of validation), but we handle that with the while loop
> in realm_init_ipa_state().
>
> Do you think it's important to define this uAPI to disallow size==0?
>
No, it's not a big deal since it's just a nitpick. The current implementation
isn't wrong because 0 will be returned when size is 0, meaning it succeeds
but no work to do there. Please leave the code as of being :)
>> if (end <= addr)
>> return -EINVAL;
>>
>>> +
>>> + if (kvm_realm_state(kvm) != REALM_STATE_NEW)
>>> + return -EPERM;
>>
>> To keep the consistency, kvm_realm_is_created() can be used here.
>>
>> if (kvm_realm_is_created(kvm))
>> return -EPERM;
>
> This isn't the same - kvm_realm_is_create() is checking for
> REALM_STATE_NONE, but we want to check for REALM_STATE_NEW.
>
Hmm, sorry for the noise. The current code is good enough then :)
Thanks,
Gavin
More information about the linux-arm-kernel
mailing list