[PATCH v3 49/60] arm64: Enable LPA2 at boot if supported by the system

Ryan Roberts ryan.roberts at arm.com
Tue Apr 18 06:50:19 PDT 2023


On 07/03/2023 14:05, Ard Biesheuvel wrote:
> Update the early kernel mapping code to take 52-bit virtual addressing
> into account based on the LPA2 feature. This is a bit more involved than
> LVA (which is supported with 64k pages only), given that some page table
> descriptor bits change meaning in this case.
> 
> To keep the handling in asm to a minimum, the initial ID map is still
> created with 48-bit virtual addressing, which implies that the kernel
> image must be loaded into 48-bit addressable physical memory. This is
> currently required by the boot protocol, even though we happen to
> support placement outside of that for LVA/64k based configurations.
> 
> Enabling LPA2 involves more than setting TCR.T1SZ to a lower value:
> there is also a DS bit in TCR that needs to be set, and which changes
> the meaning of bits [9:8] in all page table descriptors. Since we cannot
> set DS and update every live page table descriptor at the same time,
> let's pivot through another temporary mapping. This avoids the need to
> reintroduce manipulations of the page tables with the MMU and caches
> disabled.
> 
> To permit the LPA2 feature to be overridden on the kernel command line,
> which may be necessary to work around silicon errata, or to deal with
> mismatched features on heterogeneous SoC designs, test for CPU feature
> overrides first, and only then enable LPA2.
> 
> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> ---
>  arch/arm64/include/asm/assembler.h  |  8 ++-
>  arch/arm64/include/asm/cpufeature.h | 18 +++++
>  arch/arm64/include/asm/memory.h     |  4 ++
>  arch/arm64/kernel/head.S            |  8 +++
>  arch/arm64/kernel/image-vars.h      |  1 +
>  arch/arm64/kernel/pi/map_kernel.c   | 70 +++++++++++++++++++-
>  arch/arm64/kernel/pi/map_range.c    | 11 ++-
>  arch/arm64/kernel/pi/pi.h           |  4 +-
>  arch/arm64/mm/init.c                |  2 +-
>  arch/arm64/mm/mmu.c                 |  6 +-
>  arch/arm64/mm/proc.S                |  3 +
>  11 files changed, 124 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 55e8731844cf7eb7..d5e139ce0820479b 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -581,11 +581,17 @@ alternative_endif
>   * but we have to add an offset so that the TTBR1 address corresponds with the
>   * pgdir entry that covers the lowest 48-bit addressable VA.
>   *
> + * Note that this trick is only used for LVA/64k pages - LPA2/4k pages uses an
> + * additional paging level, and on LPA2/16k pages, we would end up with a root
> + * level table with only 2 entries, which is suboptimal in terms of TLB
> + * utilization, so there we fall back to 47 bits of translation if LPA2 is not
> + * supported.
> + *
>   * orr is used as it can cover the immediate value (and is idempotent).
>   * 	ttbr: Value of ttbr to set, modified.
>   */
>  	.macro	offset_ttbr1, ttbr, tmp
> -#ifdef CONFIG_ARM64_VA_BITS_52
> +#if defined(CONFIG_ARM64_VA_BITS_52) && !defined(CONFIG_ARM64_LPA2)
>  	mrs	\tmp, tcr_el1
>  	and	\tmp, \tmp, #TCR_T1SZ_MASK
>  	cmp	\tmp, #TCR_T1SZ(VA_BITS_MIN)
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 7faf9a48339e7c8c..170e18cb2b4faf11 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -1002,6 +1002,24 @@ static inline bool cpu_has_lva(void)
>  						    ID_AA64MMFR2_EL1_VARange_SHIFT);
>  }
>  
> +static inline bool cpu_has_lpa2(void)
> +{
> +#ifdef CONFIG_ARM64_LPA2
> +	u64 mmfr0;
> +	int feat;
> +
> +	mmfr0 = read_sysreg(id_aa64mmfr0_el1);
> +	mmfr0 &= ~id_aa64mmfr0_override.mask;
> +	mmfr0 |= id_aa64mmfr0_override.val;
> +	feat = cpuid_feature_extract_signed_field(mmfr0,
> +						  ID_AA64MMFR0_EL1_TGRAN_SHIFT);
> +
> +	return feat >= ID_AA64MMFR0_EL1_TGRAN_LPA2;
> +#else
> +	return false;
> +#endif
> +}

I wonder if we should rename this to cpu_has_lpa2_stage1()? My series currently
has a system_supports_lpa2() function, which wraps a system cap and reports true
only if BOTH stage1 and stage2 support LPA2. I suspect that function should then
be renamed to something like system_has_lpa2_stage12() to match?
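
To make the stage1/stage2 distinction concrete, here is a rough userspace
sketch for the 4k granule only. It is not code from either series: the helper
names (id_field_signed(), lpa2_stage1_4k(), and so on) are hypothetical, the
field positions are hard-coded from my reading of the ID_AA64MMFR0_EL1 layout
(TGran4 at [31:28], TGran4_2 at [43:40]), and the feature override handling
from the patch is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64MMFR0_EL1 field positions for the 4k granule (assumed layout). */
#define TGRAN4_SHIFT	28	/* stage 1, signed field */
#define TGRAN4_2_SHIFT	40	/* stage 2, unsigned field */

#define TGRAN4_LPA2	1	/* stage 1: 52-bit addressing supported */
#define TGRAN4_2_LPA2	3	/* stage 2: 52-bit addressing supported */

/* Extract a 4-bit ID field and sign-extend it (0xF becomes -1). */
static int id_field_signed(uint64_t reg, int shift)
{
	int v = (int)((reg >> shift) & 0xf);

	return (v & 0x8) ? v - 16 : v;
}

/* Extract a 4-bit ID field as an unsigned value. */
static unsigned int id_field_unsigned(uint64_t reg, int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Hypothetical stage 1 check, mirroring the comparison in cpu_has_lpa2(). */
static bool lpa2_stage1_4k(uint64_t mmfr0)
{
	return id_field_signed(mmfr0, TGRAN4_SHIFT) >= TGRAN4_LPA2;
}

/* Hypothetical stage 2 check; a TGran4_2 value of 0 defers to TGran4. */
static bool lpa2_stage2_4k(uint64_t mmfr0)
{
	unsigned int s2 = id_field_unsigned(mmfr0, TGRAN4_2_SHIFT);

	if (s2 == 0)
		return lpa2_stage1_4k(mmfr0);

	return s2 >= TGRAN4_2_LPA2;
}

/* Hypothetical "stage12" check: true only if both stages support LPA2. */
static bool lpa2_stage12_4k(uint64_t mmfr0)
{
	return lpa2_stage1_4k(mmfr0) && lpa2_stage2_4k(mmfr0);
}
```

With TGran4 = 1 and TGran4_2 = 2, the stage 1 check passes but the combined
check does not, which is the case the two names are meant to tell apart.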

Regardless, in my series, KVM currently decides whether or not to use LPA2 page
tables based on system_supports_lpa2(). But I will need to add a new condition:
if the kernel is using LPA2 (lpa2_is_enabled()) but stage2 reports that LPA2 is
not supported, then KVM fails to initialize. Otherwise we could end up in a
situation where KVM can't map memory passed to it by the kernel.
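
Roughly, the decision described above would look like the sketch below. This is
only an illustration of the logic, not code from my series: struct lpa2_state
and both helpers are hypothetical stand-ins, the booleans replace the real
lpa2_is_enabled() and capability checks, and -ENODEV is just an assumed error
code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs to the decision. */
struct lpa2_state {
	bool kernel_uses_lpa2;	/* stands in for lpa2_is_enabled() */
	bool stage1_has_lpa2;	/* stage 1 reports LPA2 support */
	bool stage2_has_lpa2;	/* stage 2 reports LPA2 support */
};

/*
 * KVM must refuse to initialize if the kernel uses LPA2 but stage 2 does
 * not support it, since it might be handed memory it cannot map.
 * Returns 0 on success, -ENODEV (assumed) otherwise.
 */
static int kvm_lpa2_init_check(const struct lpa2_state *s)
{
	if (s->kernel_uses_lpa2 && !s->stage2_has_lpa2)
		return -ENODEV;

	return 0;
}

/* KVM uses LPA2 page tables only if both stages support LPA2. */
static bool kvm_use_lpa2_tables(const struct lpa2_state *s)
{
	return s->stage1_has_lpa2 && s->stage2_has_lpa2;
}
```

Note the two checks are independent: a system where neither the kernel nor
stage2 uses LPA2 still initializes fine, just with non-LPA2 page tables.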



