[PATCH 15/16] arm64: Delay enabling hardware DBM feature

Suzuki K Poulose Suzuki.Poulose at arm.com
Fri Jan 26 08:05:24 PST 2018


On 26/01/18 14:41, Dave Martin wrote:
> On Tue, Jan 23, 2018 at 12:28:08PM +0000, Suzuki K Poulose wrote:
>> We enable hardware DBM bit in a capable CPU, very early in the
>> boot via __cpu_setup. This doesn't give us a flexibility of
>> optionally disable the feature, as the clearing the bit
>> is a bit costly as the TLB can cache the settings. Instead,
>> we delay enabling the feature until the CPU is brought up
>> into the kernel. We use the feature capability mechanism
>> to handle it.
>>
>> The hardware DBM is a non-conflicting feature. i.e, the kernel
>> can safely run with a mix of CPUs with some using the feature
>> and the others don't. So, it is safe for a late CPU to have
>> this capability and enable it, even if the active CPUs don't.
>>
>> To get this handled properly by the infrastructure, we
>> unconditionally set the capability and only enable it
>> on CPUs which really have the feature. Adds a new type
>> of feature to the capability infrastructure which
>> ignores the conflict in a late CPU.
>>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose at arm.com>
>> ---
>>   arch/arm64/include/asm/cpucaps.h    |  3 ++-
>>   arch/arm64/include/asm/cpufeature.h |  8 +++++++
>>   arch/arm64/kernel/cpufeature.c      | 42 +++++++++++++++++++++++++++++++++++++
>>   arch/arm64/mm/proc.S                |  5 +----
>>   4 files changed, 53 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
>> index bb263820de13..8df80cc828ac 100644
>> --- a/arch/arm64/include/asm/cpucaps.h
>> +++ b/arch/arm64/include/asm/cpucaps.h
>> @@ -45,7 +45,8 @@
>>   #define ARM64_HARDEN_BRANCH_PREDICTOR		24
>>   #define ARM64_HARDEN_BP_POST_GUEST_EXIT		25
>>   #define ARM64_HAS_RAS_EXTN			26
>> +#define ARM64_HW_DBM				27
>>   
>> -#define ARM64_NCAPS				27
>> +#define ARM64_NCAPS				28
>>   
>>   #endif /* __ASM_CPUCAPS_H */
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 70712de687c7..243ec7c77c79 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -126,6 +126,14 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>>    */
>>   #define ARM64_CPUCAP_STRICT_CPU_LOCAL_FEATURE	\
>>   	(ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_LATE_CPU_SAFE_TO_MISS)
>> +/*
>> + * CPU feature detected on each local CPU. It is safe for a late CPU to
>> + * either have it or not.
>> + */
>> +#define ARM64_CPUCAP_WEAK_CPU_LOCAL_FEATURE	 \
>> +	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|\
>> +	 ARM64_CPUCAP_LATE_CPU_SAFE_TO_MISS	|\
>> +	 ARM64_CPUCAP_LATE_CPU_SAFE_TO_HAVE)
> 
> OK, so this is similar to my suggestion for HAS_NO_HW_PREFETCH (though
> that need not have the same answer -- I was speculating there).

Yes, I was under the assumption that HAS_NO_HW_PREFETCH is treated as
a "Late CPU can't have the capability" type, hence the "STRICT_CPU_LOCAL",
as we can't apply work-arounds anymore for this CPU. However, since
we only suffer a performance impact, we could as well convert it to
a WEAK one.

> 
> Nit: tab between | and \?

Sure.

> 
>>   struct arm64_cpu_capabilities {
>>   	const char *desc;
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 2627a836e99d..8af755b8219d 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -894,6 +894,35 @@ static int __init parse_kpti(char *str)
>>   __setup("kpti=", parse_kpti);
>>   #endif	/* CONFIG_UNMAP_KERNEL_AT_EL0 */
>>   
>> +#ifdef CONFIG_ARM64_HW_AFDBM
>> +static bool has_hw_dbm(const struct arm64_cpu_capabilities *entry, int scope)
>> +{
>> +	/*
>> +	 * DBM is a non-conflicting feature. i.e, the kernel can safely run
>> +	 * a mix of CPUs with and without the feature. So, we unconditionally
>> +	 * enable the capability to allow any late CPU to use the feature.
>> +	 * We only enable the control bits on the CPU, if it actually supports.
>> +	 */
>> +	return true;
>> +}
>> +
>> +static inline void __cpu_enable_hw_dbm(void)
>> +{
>> +	u64 tcr = read_sysreg(tcr_el1) | TCR_HD;
>> +
>> +	write_sysreg(tcr, tcr_el1);
>> +	isb();
> 
> Do we need this isb?  Do we care exactly when setting TCR_HD appears
> to take effect?

Practically no, as it doesn't matter if we use it or not. But, since the
CPU is anyway booting, there is no harm in enforcing it to take effect.

> 
>> +}
>> +
>> +static int cpu_enable_hw_dbm(struct arm64_cpu_capabilities const *cap)
>> +{
>> +	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU))
>> +		__cpu_enable_hw_dbm();
>> +
>> +	return 0;
>> +}
>> +#endif
>> +
>>   static int cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
>>   {
>>   	/*
>> @@ -1052,6 +1081,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>   		.enable = cpu_clear_disr,
>>   	},
>>   #endif /* CONFIG_ARM64_RAS_EXTN */
>> +#ifdef CONFIG_ARM64_HW_AFDBM
>> +	{
>> +		.desc = "Hardware pagetable Dirty Bit Management",
>> +		.type = ARM64_CPUCAP_WEAK_CPU_LOCAL_FEATURE,
>> +		.capability = ARM64_HW_DBM,
>> +		.sys_reg = SYS_ID_AA64MMFR1_EL1,
>> +		.sign = FTR_UNSIGNED,
>> +		.field_pos = ID_AA64MMFR1_HADBS_SHIFT,
>> +		.min_field_value = 2,
>> +		.matches = has_hw_dbm,
> 
> Can't we use has_cpuid_feature here?  Why do we need a fake .matches and
> then code the check manually in the enable mathod?

We could, but then we need to add another *type*, where capabilities could
be enabled by a late CPU, where something is not already enabled by the boot-time
CPUs. i.e, if we boot a DBM capable CPU late, we won't be able to use the feature
on it, with the current setup. I didn't want to complicate the infrastructure
further just for this.

> I may be missing something here.

Cheers
Suzuki



More information about the linux-arm-kernel mailing list