[PATCH v2] arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path

Gabriel Krisman Bertazi krisman at suse.de
Wed Jan 18 08:06:42 PST 2023


Catalin Marinas <catalin.marinas at arm.com> writes:

> On Mon, Jan 09, 2023 at 12:19:55PM -0300, Gabriel Krisman Bertazi wrote:
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 03d1c9d7af82..3d1a5677321e 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -864,7 +864,11 @@ static inline bool cpu_has_hw_af(void)
>>  	if (!IS_ENABLED(CONFIG_ARM64_HW_AFDBM))
>>  		return false;
>>  
>> -	mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
>> +	/*
>> +	 * Use cached version to avoid emulated msr operation on KVM
>> +	 * guests.
>> +	 */
>> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
>>  	return cpuid_feature_extract_unsigned_field(mmfr1,
>>  						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>>  }
>
> You can probably make it slightly faster if you add a cpucap feature. We
> have ARM64_HW_DBM but not ARM64_HW_AF. Not sure you'd notice a
> difference in practice though.

Hi Catalin,

I had a patch for that, but when I noticed the existence of
ARM64_HW_DBM, I dropped it.  It is mentioned briefly in v1.

From my measurements, when comparing this version to a static variable
caching the value (patch v1), the bsearch performance impact was not
significant.  I think it would be fine to keep this version.
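
To make the comparison concrete, here is a minimal userspace sketch of
the two lookup strategies: a binary search over a sorted table on every
call, versus caching the result in a static variable.  It is not kernel
code.  The table contents, the register IDs, and the names ftr_entry,
ftr_table, lookup_bsearch, lookup_cached and self_check are all
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of a sanitised register table. */
struct ftr_entry {
	uint32_t id;	/* made-up register encoding */
	uint64_t value;	/* made-up cached register value */
};

/* Must stay sorted by id, since bsearch() relies on the ordering. */
static const struct ftr_entry ftr_table[] = {
	{ 0x10, 0x0000 },
	{ 0x20, 0x1111 },
	{ 0x30, 0x0002 },
	{ 0x40, 0x3333 },
};

/* Comparator for bsearch(): key is a register id, elem a table entry. */
static int cmp_id(const void *key, const void *elem)
{
	uint32_t id = *(const uint32_t *)key;
	const struct ftr_entry *e = elem;

	if (id < e->id)
		return -1;
	if (id > e->id)
		return 1;
	return 0;
}

/* Variant A: binary search on every call. Returns 0 for unknown ids. */
static uint64_t lookup_bsearch(uint32_t id)
{
	const struct ftr_entry *e;

	e = bsearch(&id, ftr_table,
		    sizeof(ftr_table) / sizeof(ftr_table[0]),
		    sizeof(ftr_table[0]), cmp_id);
	return e ? e->value : 0;
}

/* Variant B: search once, then reuse the value kept in a static. */
static uint64_t lookup_cached(void)
{
	static int cached;
	static uint64_t value;

	if (!cached) {
		value = lookup_bsearch(0x30);
		cached = 1;
	}
	return value;
}

/* Both variants must agree; only their per-call cost differs. */
static void self_check(void)
{
	assert(lookup_bsearch(0x30) == lookup_cached());
}
```

With a table this small the search is only a handful of comparisons,
which is in line with the difference not being measurable.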

But, of course, I can add the cpucap bit if that is preferred.  Please
let me know.

Thanks,

-- 
Gabriel Krisman Bertazi
