[PATCH v3 3/5] arm64: Add support for FEAT_HAFT
Yicong Yang
yangyicong at huawei.com
Wed Oct 23 03:30:18 PDT 2024
On 2024/10/23 2:30, Catalin Marinas wrote:
> On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 3e29b44d2d7b..029d7ad89de5 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -2176,6 +2176,21 @@ config ARCH_PKEY_BITS
>> int
>> default 3
>>
>> +config ARM64_HAFT
>> + bool "Support for Hardware managed Access Flag for Table Descriptor"
>
> Super nit: s/Descriptor/Descriptors/
>
sure.
>> + depends on ARM64_HW_AFDBM
>> + default y
>> + help
>> + The ARMv8.9/ARMv9.5 introduces the feature Hardware managed Access
>> + Flag for Table descriptors. When enabled an architectural executed
>> + memory access will update the Access Flag in each Table descriptor
>> + which is accessed during the translation table walk and for which
>> + the Access Flag is 0. The Access Flag of the Table descriptor use
>> + the same bit of PTE_AF.
>> +
>> + The feature will only be enabled if all the CPUs in the system
>> + support this feature. If unsure, say Y.
>> +
>> endmenu # "ARMv8.9 architectural features"
>>
>> config ARM64_SVE
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 3d261cc123c1..fba2347c0aa6 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -879,6 +879,30 @@ static inline bool cpu_has_hw_af(void)
>> ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>> }
>>
>> +/*
>> + * Contrary to the page/block access flag, the table access flag
>> + * cannot be emulated in software (no access fault will occur).
>> + * So users should use this feature only if it's supported system
>> + * wide.
>> + */
>> +static inline bool system_support_haft(void)
>> +{
>> + unsigned int hafdbs;
>> + u64 mmfr1;
>> +
>> + if (!IS_ENABLED(CONFIG_ARM64_HAFT))
>> + return false;
>> +
>> + /*
>> + * Check the sanitised registers to see this feature is supported
>> + * on all the CPUs.
>> + */
>> + mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
>> + hafdbs = cpuid_feature_extract_unsigned_field(mmfr1,
>> + ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>> + return hafdbs >= ID_AA64MMFR1_EL1_HAFDBS_HAFT;
>> +}
>
> Can we not have just an entry in the arm64_features array with the type
> ARM64_CPUCAP_SYSTEM_FEATURE and avoid the explicit checks here?
>
reply below...
>> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
>> index 8ff5f2a2579e..bc1051d65125 100644
>> --- a/arch/arm64/include/asm/pgalloc.h
>> +++ b/arch/arm64/include/asm/pgalloc.h
>> @@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>> {
>> pudval_t pudval = PUD_TYPE_TABLE;
>>
>> - pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
>> + pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
>> __pud_populate(pudp, __pa(pmdp), pudval);
>> }
>
> Why not set the table AF for the task entries? I haven't checked the
> core code but normally when we map a pte it's mapped as young. While for
> table AF we wouldn't get a fault, I would have thought the core code
> follows the same logic.
>
I may need to check. If I understand it correctly, for most case (e.g. a read fault) we should
make pte young if the hardware AF update is not supported. Otherwsie hardware will help to update.
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 718728a85430..6eeaaa80f6fe 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -2046,6 +2046,18 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
>>
>> #endif
>>
>> +#if CONFIG_ARM64_HAFT
>> +
>> +static struct cpumask haft_cpus;
>> +
>> +static void cpu_enable_haft(struct arm64_cpu_capabilities const *cap)
>> +{
>> + if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU))
>> + cpumask_set_cpu(smp_processor_id(), &haft_cpus);
>> +}
>> +
>> +#endif /* CONFIG_ARM64_HAFT */
>> +
>> #ifdef CONFIG_ARM64_AMU_EXTN
>>
>> /*
>> @@ -2590,6 +2602,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>> .cpus = &dbm_cpus,
>> ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, DBM)
>> },
>> +#endif
>> +#ifdef CONFIG_ARM64_HAFT
>> + {
>> + .desc = "Hardware managed Access Flag for Table Descriptor",
>> + .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>
> I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
> similar for HW DBM but there we get a fault and set the pte dirty. You
> combined it with a system_support_haft() that checks the sanitised regs
> but I'd rather have a static branch check via cpus_have_cap(). Even with
> your approach we can have a race with a late CPU hot-plugged that
> doesn't have the feature in the middle of some core code walking the
> page tables.
>
> With a system feature type, late CPUs not having the feature won't be
> brought online (if feature enabled) but in general I don't have much
> sympathy for SoC vendors combining CPUs with incompatible features ;).
>
ok. If we make it a system feature, we can using cpus_have_cap() then and
drop the system_support_haft() which is checking with the sanitised registers.
It's fine for me.
Will ask to not refuse online a CPU due to mismatch of this feature in [1],
hope we have an agreement :)
[1] https://lore.kernel.org/linux-arm-kernel/20240820161822.GC28750@willie-the-truck/
>> + .capability = ARM64_HAFT,
>> + .matches = has_cpuid_feature,
>> + .cpu_enable = cpu_enable_haft,
>> + .cpus = &haft_cpus,
>> + ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
>> + },
> [...]
>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>> index ccbae4525891..4a58b9b36eb6 100644
>> --- a/arch/arm64/mm/proc.S
>> +++ b/arch/arm64/mm/proc.S
>> @@ -495,9 +495,15 @@ alternative_else_nop_endif
>> * via capabilities.
>> */
>> mrs x9, ID_AA64MMFR1_EL1
>> - and x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>> + ubfx x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
>> cbz x9, 1f
>> orr tcr, tcr, #TCR_HA // hardware Access flag update
>> +
>> +#ifdef CONFIG_ARM64_HAFT
>> + cmp x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
>> + b.lt 1f
>> + orr tcr2, tcr2, TCR2_EL1x_HAFT
>> +#endif /* CONFIG_ARM64_HAFT */
>
> I think we can skip the ID check here and always set the HAFT bit. We do
> something similar with MTE (not for TCR_HA though, don't remember why).
>
Thanks for the reference to MTE. Will see and have a test. But a check here
may seem more reasonable as we usually detect a feature first then enable it?
Thanks.
More information about the linux-arm-kernel
mailing list