[PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1

Marc Zyngier maz at kernel.org
Wed Nov 22 07:21:04 PST 2023


On Wed, 22 Nov 2023 13:41:33 +0000,
Ryan Roberts <ryan.roberts at arm.com> wrote:
> 
> On 21/11/2023 20:34, Oliver Upton wrote:
> > On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
> >> Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
> >> page size we are using, always use LPA2-style page-tables for stage 2
> >> and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
> >> IPA size or HW-implemented PA size. When in use we can now support up to
> >> 52-bit IPA and PA sizes.
> >>
> >> We use the previously created cpu feature to track whether LPA2 is
> >> supported for deciding whether to use the LPA2 or classic pte format.
> >>
> >> Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
> >> 4KB, 64GB with 16KB). We explicitly don't enable these in the library
> >> because stage2_apply_range() works on batch sizes of the largest used
> >> block mapping, and increasing the size of the batch would lead to soft
> >> lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
> >> stage2_apply_range() batch size to largest block").
> >>
> >> With the addition of LPA2 support in the hypervisor, the PA size
> >> supported by the HW must be capped with a runtime decision, rather than
> >> simply using a compile-time decision based on PA_BITS. For example, on a
> >> system that advertises 52 bit PA but does not support FEAT_LPA2, A 4KB
> >> or 16KB kernel compiled with LPA2 support must still limit the PA size
> >> to 48 bits.
> >>
> >> Therefore, move the insertion of the PS field into TCR_EL2 out of
> >> __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode()
> >> where the rest of TCR_EL2 is prepared. This allows us to figure out PS
> >> with kvm_get_parange(), which has the appropriate logic to ensure the
> >> above requirement. (and the PS field of VTCR_EL2 is already populated
> >> this way).
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_mmu.h     |  2 +-
> >>  arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
> >>  arch/arm64/kvm/arm.c                 |  5 +++
> >>  arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  4 ---
> >>  arch/arm64/kvm/hyp/pgtable.c         | 15 +++++++--
> >>  5 files changed, 54 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index 31e8d7faed65..f4e4fcb35afc 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
> >>  	return container_of(mmu->arch, struct kvm, arch);
> >>  }
> >>  
> >> -#define kvm_lpa2_is_enabled()		false
> >> +#define kvm_lpa2_is_enabled()		system_supports_lpa2()
> > 
> > Can we use this predicate consistently throughout the KVM code? Looks
> > like the rest of this diff is using system_supports_lpa2() directly.
> 
> My thinking was that system_supports_lpa2() is an input to KVM's policy to
> decide if it is going to use LPA2 (currently that policy is very simple - if the
> system supports it, then KVM uses it - but it doesn't have to be that way), and
> kvm_lpa2_is_enabled() is how KVM exports its policy decision, so one is an input
> and the other is an output.
> 
> It's a lightly held opinion though - I'll make the change if you insist? :)

<bikeshed>
I personally don't find this dichotomy very useful. It could make
sense if we used the page table walker for S1 outside of KVM, but
that's not the case at the moment.

If there is no plan for such a use case, I'd rather see a single
predicate, making the code a bit more readable.
</bikeshed>

	M.

-- 
Without deviation from the norm, progress is not possible.



More information about the linux-arm-kernel mailing list